General David H. Berger issued the Commandant’s Planning Guidance (CPG) in July 2019, outlining his plan to transform the Marine Corps to fight peer adversaries. The CPG identified force design as the top priority for the service, and work on it began almost immediately.1 A small planning team met to provide direction for follow-on efforts, work that entered its second phase in September 2019, when this team transferred responsibility to the office of Lieutenant General Eric M. Smith, Deputy Commandant for Combat Development and Integration (CD&I).
Twelve separate teams convened during the second phase to assess the current state of the force and provide recommendations for its future. These planning teams focused on specific topics, such as infantry battalion organization and the Reserve component. An interim report—Force Design 2030—synchronized and summarized their outputs to produce an overarching plan for the third and fourth phases. Released in March 2020, the report outlines broad changes to the operating forces, including eliminating tanks and divesting most cannon artillery batteries from the Fleet Marine Force, as well as increasing the number of unmanned aerial system squadrons.
The third phase of force design will be “rapid and iterative wargaming, analysis, and experimentation” to “evaluate and refine” the initial results of Phase II.2 The Commandant describes Phase III: “I am directing the immediate implementation of an intensive program of iterative concept refinement, wargaming, analysis and simulation, and experimentation.” This effort will be led by the Marine Corps Warfighting Laboratory, although General Berger states: “I will be personally involved in and responsible for setting priorities and ensuring that necessary resources are made available.”3 Phase III of force design will refine and validate the proposed changes to the Fleet Marine Force and its operating concepts. Once the service is confident the proposed concepts are mature enough for adoption, the fourth phase will be “refinement, validation, and implementation.”4
Phase IV is crucial because it will determine whether the operating concepts and structure are suitable for implementation. Experimenting, prototyping, and testing are distinct activities, and each plays a different role in the evaluation of new ideas. The Warfighting Lab will develop the Force Design 2030 experimentation plan—and it is more than capable of developing and prototyping the necessary future operating concepts and force structures.5 But these must be validated independently through a transparent and reproducible operational test before adoption. The Warfighting Lab cannot conduct these tests because it is not an independent assessor; it reports to CD&I. And because it is an organization that conducts experiments, it should not be tasked with operational testing.
Experiment. Prototype. Test.
Experimenting, prototyping, and testing are closely related concepts, yet subtle distinctions exist. Experimentation determines the existence and strength of a causal relationship between two factors: Does X cause Y? Prototypes are the models experiments use to test premises and draw conclusions. Prototyping is the iterative process of creating a model for an experiment, conducting the experiment and evaluating data about the model’s performance, altering the design of the model based on the experiment’s conclusions, and then repeating this process to optimize the model’s performance. Testing is the verification and validation that a system or concept meets the previously defined thresholds of performance.6
The Phase II outputs—Force Design 2030—are hypotheses. The planning teams believe, for example, that the removal of heavy ground armor from the Marine Corps will enable the Fleet Marine Force to win a great-power naval campaign. Experimentation will determine whether a relationship between the two variables (absence of tanks and success in a naval campaign) exists. Data from these experiments will then be fed into the design process: A force-structure prototype will be developed to explore these hypotheses, the model will undergo an experiment, data from the experiment will be analyzed and used to modify the prototype, and the process will be repeated until the prototype is optimized. Once complete, the final prototype should be tested operationally to validate its performance.
Developmental vs. Operational Testing
Every major defense acquisition program goes through an intensive regimen of testing before it is fielded.7 There are two types of testing: developmental and operational.
Developmental testing is used to verify the performance of a system in a controlled environment. Marine Corps Systems Command (SysCom) is responsible for developing materiel solutions. It conducts developmental testing to verify that a system is capable of performing to the required thresholds. Such testing measures, say, the power output of a communication transmission system against the power threshold articulated in its requirements documents. This is an internal process for SysCom, because its personnel confirm that the equipment they designed is capable of meeting the required performance standards.8 Put simply, developmental testing verifies that a system functions as expected.
Once developmental testing is complete, the system proceeds to operational testing, which tests its performance in realistic conditions. Operational testing ensures that an average Marine can use the system to produce the desired military effect under expected conditions.9 Each service has a statutory operational test authority (OTA)—in this case, the Marine Corps Operational Test and Evaluation Activity (MCOTEA).10
To understand the difference between developmental and operational testing, think about buying a car. Car manufacturers do developmental testing to determine the fuel efficiency of a new model, but anyone who has bought a car knows they are unlikely to replicate the reported mileage. A car buyer concerned about fuel economy will get information on how the car has performed under realistic conditions from an independent evaluator before making a purchase. This independent evaluation likely will provide more realistic fuel estimates than the manufacturer’s figures.
In the Marine Corps, SysCom is, roughly speaking, the car’s manufacturer, while the independent evaluator is MCOTEA. Because MCOTEA reports to the Assistant Commandant of the Marine Corps, SysCom personnel cannot unduly influence the conduct or results of an operational test. The separation prevents conscious or unconscious bias from seeping into the evaluation. MCOTEA personnel cannot start the testing having “fallen in love” with the system because they did not design or build it.
The requirements for operational testing and independence of the testing organization are statutory obligations, not norms or traditions. Congress legislated them to ensure the Department of Defense spends taxpayer funds wisely. Although operational testing typically is performed to appraise equipment, the Corps has conducted operational testing of concepts as well. Most notably, MCOTEA evaluated the integration of women into the infantry.11
The ideas in Force Design 2030 represent a fundamental shift for the Marine Corps. As such, they must be tested thoroughly. The Marine Corps Warfighting Lab, the organization tasked with this, does not have the expertise to perform an operational test under realistic conditions, as demonstrated by its previous centerpiece exercise: Sea Dragon 2025.
Enter the Sea Dragon
Sea Dragon 2025 began in 2016 as a Marine Corps experiment to explore new ground combat capabilities and technology as a “quest for solutions to the problems of tomorrow.” It featured 3rd Battalion, 5th Marines, testing new equipment and organizational structures, then progressing through a training workup that included two service-level exercises, Integrated Training Exercise and Exercise Talisman Sabre 2017.12
Sea Dragon explored the relationship between two variables: new capabilities and “the essential elements of [the] Marine Corps operating concept.”13 As an experiment, it took place under controlled—not operationally representative—conditions. The battalion was manned at 100 percent strength, and, “to the greatest extent possible,” each of its members was of the appropriate rank and trained to the requirements of his or her billet. How many ordinary infantry battalions have 100 percent of their tables of organization filled with trained Marines of the correct rank? The event was not an independent evaluation because Combat Development and Integration developed the Sea Dragon effort, and the evaluator was the Marine Corps Warfighting Laboratory, a subordinate entity of CD&I.14
Some of the “unknowns” resolved during Sea Dragon 2025 could reasonably be applied to the rest of the Corps, or at least identified for future experimentation and testing. But the findings should not have been considered conclusive, because the experimental force comprised an idealized battalion unrepresentative of the actual Fleet Marine Force. What that says about the rest of the exercise is anyone’s guess. What was the design of the experiment? How many trials were conducted? How was the data collected and analyzed? Were the results statistically significant? There is no way of knowing.
The design of the experiment and the data collected should be published so they can be independently verified. Absent that, the experiment cannot be repeated, its results cannot be validated, and its conclusions therefore must be treated with skepticism. A cynic might conclude that the Warfighting Lab report was unduly influenced by the fact that it was evaluating an event developed by its higher headquarters, or that the results are not applicable to the fleet because the test force differed so drastically from a regular line unit. Operational tests are supposed to be structured to avoid these issues.
Sea Dragon 2025 was an experiment—an exploration of the relationship between variables in a controlled environment. But some of its findings have been treated as conclusive determinations, as though they came from an operational test. The Warfighting Lab has the expertise to experiment with and prototype force design models, but until its results are tested operationally under realistic conditions with a representative test force, they should not be adopted by the entire Marine Corps.
Testing Force Design
Force Design 2030 is a radical shift for a service that has not fundamentally reconsidered its structure since the 1950s, and as such, the Corps must take care to get the proposed design right.15
Title 10 requires every major defense acquisition program to undergo operational testing.16 Since Congress mandates independent testing for equipment, would it not also be interested in seeing the Marine Corps test the operating concept and force structure on which every future equipment purchase will be based? Representative Mike Gallagher (R-WI), a member of the House Armed Services Committee, has expressed skepticism about the utility of the current operating model in the Indo-Pacific and has called for structural changes.17 An operational test of Force Design would demonstrate the viability of the Corps’ proposed vision and, if conducted jointly with the Navy, would show that both services are capable of integration and interoperability during a maritime campaign. There are billions of dollars—and thousands of lives—at stake.
Operational testing ensures the Marine Corps employs capabilities and concepts that will help it win on the battlefield. Such assessments must be conducted in a representative environment, with a representative unit from the operating forces, to conclusively determine the suitability and effectiveness of the Force Design solution. Current plans for testing have a number of flaws.
Independence. The purpose of an independent test event is to ensure objectivity and eliminate pressure on the testing entity. Force Design 2030 is a major pivot for the Marine Corps, and the Commandant has announced a timeline and his personal involvement in its conduct.18 One of Force Design’s most important operating concepts—expeditionary advanced base operations—originated at the Warfighting Lab. Yet the proposed evaluation plan calls for the Warfighting Lab to evaluate the Force Design outputs. This structure does not allow for an independent evaluation: The lab is assessing itself.
The same goes for the personal involvement of the Commandant. Service culture being what it is, anyone involved in the evaluation could easily perceive a link between the viability of his or her career and a positive test result. The Uniform Code of Military Justice recognizes the problem of unlawful command influence, whereby statements by someone in authority that influence, or appear to influence, judicial proceedings can result in a case being overturned. A similar principle is at play here, and the risk could be avoided by tasking an independent entity to conduct the evaluation. Since the Marine Corps is a naval service and implementing Force Design requires interoperability with the Navy, the Corps should consider tasking the Navy’s Operational Test and Evaluation Force, taking advantage of its independence and its expertise in the maritime domain.
Operationally representative test force. A recent War on the Rocks article (coauthored by a Marine from the Warfighting Lab) called for a servicewide application process to “ensure the experimental process is filled with the most passionate Marines.”19 By all means—the most talented and interested members of the service should be involved in experimentation and prototyping to refine and validate the proposed construct. However, they cannot be disproportionately involved in operational testing, because their inclusion would represent selection bias. The “most passionate” Marines are, by definition, not representative of the fleet. The test force should be constructed after a review of historical data to determine typical manning levels and training completion in the force today. The test force must be representative of the average operating force unit because, in a crisis, that is the “force in readiness” that will be employed, with little time for training or augmentation.
Operationally representative environment. Since the Commandant called for testing to occur in realistic maritime and littoral terrain and said, “We will not seek to hedge or balance our investments to account for [other] contingencies,” the Corps must define the environment and terrain conditions in which it anticipates the next conflict will occur.20 Doing so would ensure that equipment in testing is optimized for that specific environment rather than hedged across the entire terrain, temperature, and humidity spectrum.21 Disseminating specific guidance would have the added benefit of informing the current operational testing of systems that will remain in the inventory for decades. Sea Dragon 2025 exercises took place in Darwin, Australia, and Twentynine Palms, California. Neither is representative of relevant parts of the western Pacific. Further, both took place on restricted military training grounds that lack robust civilian populations or major urban centers; as such, they were not representative information environments. Identifying a testing site that resembles the Indo-Pacific maritime environment and provides representative littoral, informational, and human terrain would increase the realism of the test.
The concepts in Force Design 2030 represent a bold leap for the Marine Corps, but the Corps must get it right. Doing so requires the right kind of independent operational testing. This is not about legal compliance, additional bureaucracy, slowing force transformation, or poking holes in Sea Dragon 2025. If we hand future Marines unworkable operating concepts and force structures, we will kill them. Houses built on sand will not stand.
1. GEN David H. Berger, USMC, Commandant’s Planning Guidance (Washington, DC: U.S. Marine Corps, 2019).
2. GEN David H. Berger, USMC, Force Design 2030 (Washington, DC: U.S. Marine Corps, 2020), 5 and 8.
3. Berger, Force Design 2030, 11, 18: “A major focus of my tenure as Commandant will be my direct, personal, regular engagement with our Warfighting Laboratory to drive an integrated process of wargaming and experimentation that will rapidly produce solutions for further development in accordance with my guidance and vision.”
4. Berger, 5.
5. Berger, 18.
6. Office of the Under Secretary of Defense for Research and Engineering, Department of Defense Experimentation Guidebook, August 2019 (Version 1.0), 5–8.
7. 10 U.S. Code § 2399, “Operational Test and Evaluation of Defense Acquisition Programs.”
8. Defense Acquisition University, Guidebook Glossary of Defense Acquisition Acronyms and Terms, “Developmental Testing.”
9. Defense Acquisition University, Guidebook Glossary, “Operational Testing.”
10. 10 U.S. Code § 2399.
11. COL Keith Moore, USMC, Marine Corps Operational Test and Evaluation Activity, Ground Combat Element Integrated Task Force Experimental Assessment Report (Quantico, VA: U.S. Marine Corps, August 2015).
12. GEN Robert Neller, USMC, Sea Dragon 2025, ALMAR 024/16, August 2016.
13. Neller, Sea Dragon 2025.
14. COL C. Pappas III, USMC, Release of Final Report for Sea Dragon 2025 Phase 1 Experiment Plan, MARADMIN 186/18, March 2018.
15. Pappas, Release of Final Report, 2.
16. 10 U.S. Code § 2399.
17. Rep. Mike Gallagher, “To Deter China, the Naval Services Must Integrate,” War on the Rocks, 4 February 2020, warontherocks.com.
18. Berger, Force Design 2030, 11.
19. Jeff Cummings, Scott Cuomo, Olivia A. Garard, and Noah Spataro, “Getting the Context of Marine Corps Reform Right,” War on the Rocks, 1 May 2020, warontherocks.com.
20. Berger, Force Design 2030, 6; Berger, Commandant’s Planning Guidance, 5.
21. Equipment must undergo testing to comply with uniform engineering and technical requirements to ensure it can operate in all climates and conditions. For future systems, these requirements should be optimized for the specific characteristics of the environment in which the Marine Corps anticipates fighting.