A first look at IBM’s new large language model that’s fine-tuned for defense applications

The new IBM Defense Model combines advanced AI with domain-specific data from Janes to help improve users’ decision-making.
The IBM logo at the headquarters of IBM Germany in the Highlight Towers in Parkstadt Schwabing in Munich (Bavaria). (Photo by Matthias Balk/picture alliance via Getty Images)

IBM is set to launch a new large language model that’s purpose-built for defense and national security applications and trained on data from open-source intelligence provider Janes.

Senior officials from each of the companies gave DefenseScoop an exclusive preview of the IBM Defense Model ahead of its initial public rollout on Wednesday.

The tool “really understands defense terminology, equipment, standards and mission context. The model is uniquely able to be deployed in defense environments and have an immediate impact,” said Vanessa Hunt, technology general manager for U.S. Federal Market at IBM.

LLMs — and the overarching field of generative AI — involve disruptive technologies that can produce convincing but not always accurate text, software code, images and other media, based on prompts from humans.

This rapidly evolving realm holds a lot of promise for Defense Department and military users, but also poses unknown and serious potential challenges. Pentagon leadership is investing heavily to accelerate its enterprise-wide adoption of some of the most advanced commercial algorithms and LLM capabilities.

IBM’s new defense-specific LLM offering is built on the company’s Granite foundation models and deployable in air-gapped, classified, and edge settings.

“There’s a lot of interest, more interest than I think I’ve ever seen in a new product,” Ben Conklin, who leads innovation at Janes, told DefenseScoop. “And generally, we kind of look at the two primary use cases on this first version of the model as being either supporting operational planning and intelligence functions within the military, but also within the defense industrial base. There’s corporate strategy and planning departments that have to consider their future strategies, what kind of equipment they’re going to build, and what capabilities they need.”

Users will be able to connect this tool into their secure environments via an application programming interface or API, which typically facilitates communication and interaction for LLMs.

“Like, if somebody’s the integrator for CJADC2 or Maven, or anything like that, they could integrate this into that system,” Conklin said, referring to DOD’s plan for Combined Joint All-Domain Command and Control and the Maven Smart System that underpins one of the department’s sprawling AI initiatives.

Janes is the primary source of data that powers the IBM Defense Model.

“Generally, I’d say it’s publicly available information [that’s] lawfully obtained. What I mean by that is we get it directly from military equipment manufacturers, government public statements, other things like that. And in that work, our analysts and our experts collate that data into a structured dataset,” Conklin explained. 

He pointed to an air show in Asia as a “simple example.” If someone from Janes attended, they would take photographs, speak to manufacturers, and collect information about the equipment on display. That intel would then get reported into a database, and then the model could learn from it to help inform military planners.

“And that’s … for every military in the world,” Conklin said. “Any military, any country that builds military equipment and exports it, would share that kind of information with Janes.”

Hunt emphasized that Janes’ data for the LLM is continuously being refreshed.

“And those updates are delivered through secure feeds on a scheduled basis, so the model remains current without compromising security in any way,” she said.

One notable element that makes this LLM “drastically different” from the other general-purpose models (including those DOD is already likely tapping into), according to both officials, is that it’s not trained from the get-go on the internet — but instead, on carefully curated information that’s vetted by humans at Janes.

“The internet has a lot of information about the military, and most of it’s wrong,” Conklin said. “So, if you ask it something about a piece of equipment, you’re more likely to get the wrong answer than the right answer. If you come to this precise model, you’ll get the right answer.”

That data, which is updated all the time, is delivered separately from the model. Rather than memorize every fact from Janes, the LLM queries the data and poses questions to understand it.

“The world is changing all the time, and things like drones and other things are changing on a daily basis. So that data could be updated, and then the model would actually just ask questions of the live data. So, that’s a really good way to build a system that can be updated. Of course, the model will be improved,” Conklin said. 
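The article does not say how this separation is implemented. One common way to build such a system is to look up current records at query time and hand them to the model alongside the question, so a data refresh changes the answers without retraining. The sketch below illustrates that general pattern only; `EquipmentStore`, `build_prompt`, and the "Example Drone" record are hypothetical names and data, not part of any IBM or Janes product.

```python
from dataclasses import dataclass, field


@dataclass
class EquipmentStore:
    """Hypothetical stand-in for a structured dataset refreshed independently of the model."""
    records: dict = field(default_factory=dict)

    def update(self, name: str, facts: dict) -> None:
        # A scheduled data feed would call this; no model retraining happens here.
        self.records.setdefault(name.lower(), {}).update(facts)

    def lookup(self, name: str) -> dict:
        # Return a copy so callers cannot mutate the stored record.
        return dict(self.records.get(name.lower(), {}))


def build_prompt(question: str, store: EquipmentStore, subject: str) -> str:
    """Fetch the current facts at query time and place them ahead of the question."""
    facts = store.lookup(subject)
    if facts:
        context = "\n".join(f"{key}: {value}" for key, value in sorted(facts.items()))
    else:
        context = "No record found."
    return f"Context:\n{context}\n\nQuestion: {question}"


# Illustrative data only: the same question picks up the refreshed value.
store = EquipmentStore()
store.update("Example Drone", {"range_km": "100"})
before = build_prompt("What is its range?", store, "Example Drone")

store.update("Example Drone", {"range_km": "150"})
after = build_prompt("What is its range?", store, "Example Drone")
```

In this sketch the prompt built after the update carries the new value, while the code that would call the model is left out entirely; that step depends on the deployment and is not described in the article.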

The companies have opted for a subscription-based pricing model that is flexible enough to meet the needs of various customers and to support those continuous updates.

“It is definitely something you would subscribe to. But we want it to plug into systems that are probably around for the next 20 years, and so there’ll be work that we do with different integrators to make that happen. So it’s not like it’s a one-size-fits-all, but we’ll have a baseline approach to it that I think matches the value you get in subscription,” Conklin confirmed.

The model’s makers foresee use cases expanding dramatically in the near term, because the technology can run in a customer’s own environment and train with their data, as well as all the intel from Janes. The two officials told DefenseScoop that they expect to see some of the first implementations of the LLM within the next few months.

“We’ve had a lot of early engagement to get to understand the customer requirements and to hear what they really need. So, we’re pretty confident that we have a set of key use cases we can deliver out of the gate. And of course, we have a whole product roadmap that we’re going to work on to improve,” Conklin noted.

The partnership’s roots trace back a little more than a year, to when officials from the two businesses connected at an event in Virginia where they were presenting to NATO members.

“I think it, from my perspective, is a natural fit — because IBM is, as far as I’m aware, is one of the only model producers who build a model that they can identify all the data that went into it,” Conklin said. “So, they had this great Granite model to start with, and then they could add our data to it and keep that integrity intact.”

He and Hunt noted that the new IBM Defense Model is designed to augment human intelligence and help analysts and other officials work faster and more accurately.

“It’s really meant to be a decision-support tool. In no way do we expect this to replace the human element of decision-making for the military,” Hunt said.

Written by Brandi Vincent

Brandi Vincent is DefenseScoop’s Pentagon correspondent. She reports on disruptive technologies and associated policies impacting Defense Department and military personnel. Prior to joining SNG, she produced a documentary and worked as a journalist at Nextgov, Snapchat and NBC Network. Brandi grew up in Louisiana and received a master’s degree in journalism from the University of Maryland. She was named Best New Journalist at the 2024 Defence Media Awards.