The Department of Defense is set to kick off a pilot in the coming weeks that aims to help the U.S. military assess the trustworthiness of artificial intelligence systems.
The new Center for Calibrated Trust Measurement and Evaluation initiative, sponsored by the Office of the Under Secretary of Defense for Research and Engineering, will be coordinated by Carnegie Mellon University’s Software Engineering Institute. It will also involve other partners inside and outside DOD, according to Kim Sablon, principal director for trusted AI and autonomy within the Pentagon’s R&E directorate.
“We’re pooling really a consortium … There’s opportunity for industry to play. And the idea for us is how do we bring the test and evaluation community with the research and development community together, to really start thinking through what the standard method process is for even providing evidence for assurance, and for measuring and calibrating trust along a human-system balance of roles and responsibilities, when we’re talking about different teams, you know, whether it’s dyadic or heterogeneous teams, and so forth,” Sablon told DefenseScoop Tuesday on the sidelines of NDIA’s Emerging Technologies for Defense conference and expo.
“We’ve got to have standards by which industry can adhere, so they know when they’re working with us and they’re providing solutions, we’re all in lockstep with common frameworks, and so forth,” she added.
As the Pentagon pursues what it calls “responsible AI” and trustworthy autonomous weapon systems, technologies will have to go through verification and validation processes before they can be fielded.
“We do need to bring the acquisition [community] with the research, developers and the test and evaluation folks together to really get to the heart of some of the V&V challenges,” Sablon said. “I really see [the pilot] as a way for us to operationalize responsible AI, really operationalize value alignment.”
The pilot will address fundamental complexities and engineering challenges associated with AI assurance, and develop some of the standard frameworks, methods and processes for how the Pentagon evaluates trustworthiness.
“It’s ethics by design and security by design, where it’s warfighter-in-the-loop design, development and training. Because trust is a validation conversation, and that happens naturally when warfighters are training together, then trust evolves, right. So we want to start qualitatively pulling data from tabletop exercises to inform some of the simulation exercises and ultimately inform” other aspects of the process, Sablon said.
The new initiative will also focus on workforce development and education.
“When we’re talking responsible use of AI, I think it requires a cultural change, OK. And so we’ve got to have a science of responsible AI test-and-evaluation degree programs or professional certification programs in place. And we’re starting to work with various academic institutions locally, and there’ll be a broad agency announcement for that. But we’ve got to get the workforce and education component right. And that’s going to be a part of [the effort] as well,” she said.
The pilot is slated to officially launch before Sept. 30, which marks the end of fiscal 2023. About $20 million in funding is being allotted for it.
“We’re going to carry it through, certainly, FY ’24 … and I’m hoping that in the ’25 budgets, you know, we can sustain that. But we’ve been working very closely with CDAO [the Pentagon’s Chief Digital and AI Office] as well. Test and evaluation is also within their mission. And so there’s opportunity there for them to really take this to the next level, especially when we’re talking about standards for providing evidence for assurance,” Sablon said.
Sablon’s comments came a day after Deputy Secretary of Defense Kathleen Hicks announced a new initiative known as Replicator, which calls for fielding thousands of autonomous systems — like uncrewed aircraft and underwater drones — in the next two years or less, to help counter China’s military buildup.