One of the challenges in integrating the U.S. and its allies and partners in the Indo-Pacific is the complexity of how a potential adversary might engage each country in different ways leading up to a conflict: tactically, strategically, economically, and politically. And there is just as much complexity in how each country might respond in its own way.
It is difficult for wargaming and exercises to fully capture this complexity, with its clues to effective mission-partner integration. However, an emerging form of AI known as reinforcement learning can play an important role. Essentially, this technology makes it possible for each country in a virtual wargame—whether an adversary, the U.S., an ally, or a partner—to be represented by its own AI "agent."
Each agent—a sophisticated algorithm—brings together and analyzes vast amounts of data about that country, including its military capabilities, its political and economic environment, and its posture toward the other nations. A unique feature of reinforcement learning is that it allows the AI agent to pursue its own best interest, so that in a wargame representing a country, the AI behaves much like that country would.
This can provide valuable insight into the often-difficult challenges of mission-partner integration. For example, an AI agent representing a critical partner in the Indo-Pacific might discover, over multiple scenarios, that certain security cooperation activities would likely elicit economic or diplomatic pressures from an adversary, and that the best course of action would be to disengage and remain neutral.
Or, the AI agent might find that if allies or partners have certain defensive weapons or other protections in place before a conflict, that would deter—or at least defer—adversary aggression. Such AI-informed scenarios can help map out the steps needed to make sure our allies and partners get the capabilities they need to maximize deterrence.
Defense organizations are already beginning to use reinforcement learning in operational planning, by wargaming how opposing forces might engage tactically in battle. But reinforcement learning can go even further, by helping to integrate the U.S. and its allies and partners in the Indo-Pacific through all phases of competition, crisis, and conflict, to help create a force of forces.
How Reinforcement Learning Works
With reinforcement learning, algorithms try to achieve specific goals, and get rewarded when they do. Using trial and error, the algorithms test out random possible actions. The closer those actions get the algorithms to their goals, the higher their score. If the actions move the algorithms away from their goals, the score drops.
In this way, the algorithms can rapidly work through thousands or even hundreds of thousands of scenarios, in a game-like setting, to determine the best course of action. With each iteration, they learn more about what works and what doesn’t, and get closer and closer to the optimal solution.
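The trial-and-error loop described above can be sketched in code. The following is a minimal illustration using tabular Q-learning on a toy five-state task; the environment, parameter values, and variable names are illustrative assumptions, not drawn from any real wargaming system.

```python
import random

# Toy task: states 0..4 along a corridor; reaching state 4 is the goal.
N_STATES = 5
ACTIONS = [-1, 1]                 # step left or right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

# Q-table: the algorithm's learned "score" for each (state, action) pair
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Move the agent; reward +1 at the goal, 0 elsewhere."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

random.seed(0)
for episode in range(500):        # many more iterations in practice
    state, done = 0, False
    while not done:
        # Sometimes explore randomly; otherwise take the best-known action
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward, done = step(state, action)
        # Score rises for actions that lead toward the goal, drops otherwise
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state

# The learned policy: the highest-scoring action from each non-goal state
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

After enough iterations, the policy settles on stepping toward the goal from every state, which is the "optimal solution" the passage describes the algorithms converging toward.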
Because the algorithms can perceive their environment in a virtual wargame, and participate autonomously, they are considered to be AI agents. And reinforcement learning is well suited for wargaming. An AI agent can take a side and play a role, trying to achieve its own specific goals and learning as it goes along. Just as important, multiple agents in a wargame—for example, representing various allies and partners in the Indo-Pacific—can learn how to best work together to achieve common goals in the face of an adversary.
Virtual wargaming is just one example of how reinforcement learning can assist defense organizations. It can also help optimize weapons pairing, the kill chain process, cybersecurity, and other challenges.
How Reinforcement Learning Is Trained
The process of integrating allies and partners with reinforcement learning begins by bringing together a wide range of data about a particular country. In addition to information on the country’s military and other resources, it can include its recent history—for example, how an ally’s economy and politics were affected by outside pressures in the past, and how the country responded when faced with certain pressures from an adversary. All this information teaches the AI agent what kinds of actions it might see from agents representing other countries, and what kinds of actions it can take on its own.
At the same time, the AI agent is provided with that country’s goals, based on the knowledge of experts on its culture, politics, economy, military, and other areas. The agent is then programmed to use the actions at its disposal to achieve those goals. While it may be impossible to capture the full picture of a country—or the complete international environment—even limited AI agents, interacting with one another, can provide important insights. And as new information about countries is added into the mix, AI agents continually learn.
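One way to picture the training setup described above is as a country profile plus a weighted goal function that scores outcomes from that country's point of view. The fields, weights, and reward function below are purely illustrative assumptions for the sketch, not an actual country model.

```python
from dataclasses import dataclass

@dataclass
class CountryProfile:
    military_strength: float   # 0..1, relative capability
    economic_exposure: float   # 0..1, vulnerability to economic coercion

@dataclass
class Goals:
    deter_aggression: float    # weight the agent places on deterrence
    economic_stability: float  # weight on avoiding economic damage

def score(profile: CountryProfile, goals: Goals,
          deterrence_achieved: float, economic_damage: float) -> float:
    """Score an outcome for this agent: its own goals, not the coalition's."""
    return (goals.deter_aggression * deterrence_achieved
            - goals.economic_stability * economic_damage * profile.economic_exposure)

# Example: a partner that weights economic stability heavily may score
# "disengage and stay neutral" above "cooperate and absorb sanctions."
partner = CountryProfile(military_strength=0.4, economic_exposure=0.8)
goals = Goals(deter_aggression=0.5, economic_stability=1.0)

cooperate = score(partner, goals, deterrence_achieved=0.7, economic_damage=0.6)
neutral = score(partner, goals, deterrence_achieved=0.1, economic_damage=0.0)
print(cooperate, neutral)
```

In a full system, an agent trained against this kind of reward would learn, over many scenarios, which actions best serve its own country's goals, which is how the disengage-and-remain-neutral behavior described earlier can emerge.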
Reinforcement Learning In Action
In a virtual wargame, AI agents for the adversary, the U.S., and various allies and partners enter a scenario and begin interacting with each other autonomously—each balancing its own strengths and weaknesses to achieve its goals the best way possible. In one scenario, for example, an adversary might try to use economic or diplomatic coercion against a number of different allies and partners at the same time, or launch sophisticated disinformation campaigns designed to pit countries against one another and break apart the coalition.
With each country pursuing its own best interest, the AI agents can reveal how they might work together against the adversary, or splinter from the others. A partner in the Pacific might decide to provide some assets to the coalition, but not others. An ally might be particularly susceptible to an adversary’s disinformation campaign, and refuse to cooperate with other allies or partners. These kinds of scenarios can suggest actions the U.S. and its allies and partners might take, which they can then try out as the virtual wargame continues.
A wargame can play out with hundreds of thousands of iterations, giving the AI agents the chance to try out any number of possibilities, and find the best solutions. Throughout the process, domain experts continually verify the AI agents' goals and actions, making sure they accurately reflect the real world.
Reinforcement learning doesn’t replace current approaches to wargaming, planning and other activities. Rather, it is a powerful tool to aid decision- making, as leaders seek to integrate the U.S. and its mission partners into a potent force of forces in the Indo-Pacific.
Lt. Col. Michael Collat ([email protected]) is a Booz Allen principal leading the delivery of data analytics, counter-malign foreign influence, and digital training solutions across USINDOPACOM. A former Air Force intelligence and communications officer, he has also led projects delivering cyber fusion processes, information operations assessments, and regional maritime and aerospace strategies.
Vincent Goldsmith ([email protected]) is a Booz Allen solutions architect providing transformational technical delivery across USINDOPACOM. He focuses on wargaming, modeling and simulation, immersive, cloud, and AI solutions, and he partners with warfighters in region to integrate the latest innovative technology into their baselines, to advance the mission.