Task Force Lima preps new space for generative AI experimentation

The task force commander provided DefenseScoop with an exclusive preview of his team's plans to unleash an experimental sandbox.
Task Force Lima military patch (CDAO image)

The Pentagon’s Task Force Lima team is getting set to launch a new “virtual sandbox” hub where military personnel will be able to responsibly experiment with approved generative artificial intelligence tools that hold potential to enhance their work.

“We already have the plans. It’s a matter, now, of executing on those plans,” Navy Capt. M. Xavier Lugo recently told DefenseScoop.

Lugo was tapped as Lima’s commander when Deputy Defense Secretary Kathleen Hicks established the temporary task force in August 2023 within the Chief Digital and AI Office (CDAO). Its mission is to help the Department of Defense assess, synchronize and employ generative AI, a technology broadly associated with large language models that generate convincing but not always accurate text, media and software code based on human prompts. Hicks set an 18-month deadline by which Lugo and his Lima team are expected to produce materials, resources and a path forward to guide DOD’s approach to adopting this emerging technology.

Generative AI applications are already showing promise for military functions, but Lugo and other experts also acknowledge that the tech could pose serious risks that are still far from fully understood.

During a recent virtual panel hosted by the Center for Strategic and International Studies, Lugo discussed the nearly 230 potential generative AI use cases that have been submitted to his team and are now being explored for and by the DOD. He also shed light on the CDAO’s new “Alpha-I” funding line and portfolio that’s now also under his purview as a division chief for AI scaffolding and integration in the office’s algorithmic warfare group.


In an exclusive interview with DefenseScoop after that CSIS event, Lugo explained more about DOD’s vision for Alpha-I and that sandbox-enabled experimentation — and he also reflected on Task Force Lima’s learnings to date and future plans.

“In this generation, we are the generative AI pioneers. We’re still not the generative AI citizens. So, we’ve got to think about how those citizens are going to be dealing with this,” he told DefenseScoop.   

‘More than just notional’

A longtime Navy officer with extensive experience in the Supply Corps, Capt. Lugo is also a mechanical engineer by degree, and he’s been a coder since high school.

“There are three ways of using [generative AI] technology from my perspective, right now,” he said in the interview. 


The first involves generating and summarizing text and documents, which Lugo recognizes sounds boring but is very useful for military personnel.

“If you have thousands of documents and you assign your youngest officer to go and summarize those, it’s going to take a length of time. If you assign [a large language model] to do it, it’s going to take very, very, very little time to do it. Right. Now, the problem is the quality of the output,” Lugo said. 

So, humans would need to verify that the information provided by the technology is correct.

Lugo noted that the second generative AI use case category for DOD right now essentially enables staff to interrogate and analyze applicable data.

“We’ve started with it and there’s more to go. But that one is something that, in the maintenance side of the house, we’ve already done. So the Air Force has already connected all their aircraft [and] data — and you, as a maintainer, can go in and say, ‘I want to see last week’s performance of this particular pump,’” the Lima chief explained.


Personnel can then specify the types of graphs and resources they’d like to see to visualize the department’s data.

“And number three — which is the one that I’m most excited about, and we’ll probably get into in the future — [involves code generation] and conversing with your machines. And that’s the one where you can actually, as a fighter pilot, imagine if you could just tell your display what you want to see — versus having to go through all the menus and to customize your display,” Lugo said. 

Task Force Lima has only just started to dig into this third bucket of code-making generative AI.

“But we are utilizing it with humans in the loop, and it’s more than just notional,” Lugo confirmed. “I won’t tell you where. But I can tell you it is being utilized.”

At the same time though, when it comes to these particular types of use cases, humans still currently “lack imagination,” in his view.


“We haven’t thought of how we’re going to be interacting with machines as if they were our partners for assistance, that has not come into play yet,” Lugo said.

Since day one, his team on the task force has moved cautiously and deliberately so as not to overlook any unforeseen consequences of military generative AI experimentation.

“Right now we have the luxury of having [human subject matter experts] to check this. And those SMEs grow and through the process of the growing pains of having to do all that work and getting there. If we are using a computer — we are using a technology that doesn’t let us think that way — are we going to still have SMEs in a generation or two from now? We’ve got to be very careful and we’ve got to think philosophically as to how we’re going to implement these machines or this technology. And that doesn’t escape me,” Lugo told DefenseScoop. “And that’s part of why, when we think about these use cases, we come up with the potential negatives or risks associated with that in the human side of the house. We’ve got to be careful.”

A safe place to play

During the CSIS discussion, Lugo and his colleague Col. Matthew Strohmeyer broadly went over how DOD’s ninth Global Information Dominance Experiment recently served as a unique venue for military-supporting generative AI exploration.


In the exclusive interview alongside Lugo after that panel, Strohmeyer, the CDAO’s Combined Joint All-Domain Command and Control (CJADC2) experimentation division chief, further spotlighted some of the large language model-aligned use cases that officials have pursued during GIDE so far.

“One is for intelligence workflows. We’ve used it to be able to gain a better understanding of how both the operational environment, as we call it, is changing in a specific area — and then how a competitor might be changing the actions they are taking,” he explained.

Other GIDE-specific use cases involved helping military officials plan activities and, separately, work out options for logistics workflows.

“We’ve really started to strongly partner with the algorithmic warfare directorate, where Task Force Lima sits. And so we are going to be doing even more — especially now that we’ve got the [fiscal 2024] appropriations bill — we’re going to be doing even more experimentation because the warfighters have really found a lot of value to it,” Strohmeyer told DefenseScoop.

Lugo chimed in: “And this is where the sandbox is coming in.”


The task force commander explained that, although it’s incredibly important “because it’s mostly associated with [connecting] the decision advantage pieces within the combatant commands,” GIDE marks one of many ways his team helps the military experiment with generative AI use cases.

“This is where I say ‘experiment with purpose.’ I don’t experiment just to figure out if this thing is sentient, right — I’m not doing any of that stuff. That’s research that universities do. Our experimentation is really about experimenting with how to make it fit into a workflow,” Lugo said.

The task force suggests some of those possible applications to different DOD offices and teams, based on learnings they gain from conferences, industry engagements and other research. 

“We’ve become a little bit more of a consultant in some of those cases — but now with the sandboxes, we can enable them to play around more,” Lugo noted. “You need a place to play that is safe.”

The senior CDAO official offered an analogy that he uses to explain what the sandbox hub will ultimately provide. In it, he divides the world into three parts: the “wild,” or everything external to the Pentagon; the “zoo,” or everything inside DOD; and then the “cages,” where department insiders can “play around” with large language models in a way that they know is safe and secure, meeting government standards.


“Just like any experiment, you’ve got a hypothesis saying this is going to help us in this aspect, this aspect, and this aspect. And all you’re doing is either proving or disproving that hypothesis,” Lugo said. “But now it’s no longer just an academic exercise — it is actual [DOD] data.”

Though he couldn’t provide a precise timeline for when the CDAO’s new generative AI experimental sandboxes may fully come to fruition and be deployed for widespread use, the task force chief predicted it would be “soon.”

Lugo shared that the office will gain new investments and resources from the recently passed fiscal 2024 appropriations bill to enable the sandbox hub. 

This work falls under the new “Alpha-I” portfolio and budget line within the CDAO.

“I mean, there’s still a lag between that [appropriations] document and then when you can actually execute on that,” Lugo said.


There is also a wide range of other projects, tests and policy-shaping priorities the Lima team is currently pursuing. But at this point, in Lugo’s eyes, it’s too soon to tell if there will be a need for the temporary task force to evolve into a permanent DOD entity down the line.

“One of my tasks, at the end of the day, is [figuring out] what transitions out of the task force and where should it go? So let’s say just, one of our activities is developing cybersecurity documents. Well, that could transition to perhaps CIO — so we point to where it could go into and we make sure that they can accept it. And then once that’s done, then I can call it ‘mission complete.’ I will not call ‘mission complete’ until all tasks are either completed or transitioned. That’s the goal,” Lugo told DefenseScoop.

Written by Brandi Vincent

Brandi Vincent is DefenseScoop's Pentagon correspondent. She reports on emerging and disruptive technologies, and associated policies, impacting the Defense Department and its personnel. Prior to joining Scoop News Group, Brandi produced a long-form documentary and worked as a journalist at Nextgov, Snapchat and NBC Network. She was named a 2021 Paul Miller Washington Fellow by the National Press Foundation and was awarded SIIA’s 2020 Jesse H. Neal Award for Best News Coverage. Brandi grew up in Louisiana and received a master’s degree in journalism from the University of Maryland.
