If It Ain’t Broke, When Do We Fix It?

In recent years, the Navy has shifted from primarily time-directed maintenance to condition-based maintenance (CBM). There are sound fiscal reasons for this; replacing parts only when they are about to fail clearly saves money. This “lean” philosophy has proved highly effective in profit-driven industrial production, where saving money is another way to make money. The Navy, by contrast, has two competing goals: to save taxpayer dollars and to maintain a capable, mission-ready fighting force. CBM certainly has saved dollars. It also has had the unintended side effect of compromising mission readiness.

In a 2017 speech, then–Commander, Naval Sea Systems Command, Vice Admiral Thomas Moore used the example of tank maintenance to illustrate the problem:

I have enough data to know that when this type of ship comes in for this type of availability and the age of the ship is this, I statistically know I’m probably going to have to go work on X number of tanks. Today, what I do is I say, go inspect the tanks, and then I go open them up, and I find, oh, I’ve got to go work on them. That means the shipbuilder has to go get material and do the engineering work and I start the work late. . . . [Instead,] you know there’s going to be probably 25 tanks, so buy the material up front, do the engineering, load those resources into your plan, and then if you get in there and it’s not 25 tanks, it’s 20, okay, fine.¹

In other words, CBM has become focused on monitoring the condition of each individual part. A recent Government Accountability Office report found that, between 2015 and 2019, 75 percent of carrier and submarine maintenance periods were completed late, an average of 113 days late for carriers and 225 days late for subs.² While CBM is not entirely to blame, better outcomes can be achieved by leveraging historical data to make statistical, rather than exact, maintenance decisions. This will lead to a more efficient use of labor and inventory by enabling work packages to be developed long before the scheduled availability.
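Admiral Moore's tank example can be sketched as a simple planning calculation. The sketch below assumes, purely for illustration, that each tank independently needs work with a probability estimated from past availabilities (a binomial model); the historical counts, the 90 percent coverage level, and the function names are hypothetical and are not drawn from any Navy system.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def tanks_to_plan_for(n_tanks: int, p_needs_work: float, coverage: float = 0.9) -> int:
    """Smallest k such that P(tanks needing work <= k) >= coverage."""
    for k in range(n_tanks + 1):
        if binomial_cdf(k, n_tanks, p_needs_work) >= coverage:
            return k
    return n_tanks


# Hypothetical history: 230 of 400 tanks opened on comparable ships needed work.
p_hat = 230 / 400

# Expected number of tanks needing work on a ship with 40 tanks.
expected = 40 * p_hat

# Number of tanks to buy material for so that the plan is sufficient
# in roughly 9 out of 10 availabilities under the binomial assumption.
plan = tanks_to_plan_for(40, p_hat, coverage=0.9)

print(f"expected tanks: {expected:.1f}, plan for: {plan}")
```

Planning to a high quantile rather than the average is one way to trade a little surplus material against the cost of starting work late; the right coverage level would depend on actual material and delay costs.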

Combining data and statistics to save both time and money is precisely the hope held out by predictive maintenance (PM). A PM algorithm uses prior knowledge about a component type and the performance history of each individual component of that type to deliver (at minimum) a prediction such as: Component X has a 40 percent probability of failure within 95 operating hours.³ If the failure probability exceeds a predetermined threshold, the component is replaced. A prediction also can include a confidence score (90 percent confident that component X has a 35-45 percent probability of failure), which reflects the strength of the model’s prediction given the quality of input data. When combined with labor and material costs, this can be used to decide—in Admiral Moore’s example—which tanks to inspect and the proper resources to put on order.
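The threshold rule described above, and the way cost data can set the threshold, might look like the following minimal sketch. The `Prediction` class, the cost figures, and the break-even rule are illustrative assumptions; a fielded PM system would use a richer cost model.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Output of a hypothetical PM model for one component."""
    component: str
    p_fail: float          # predicted probability of failure within the horizon
    horizon_hours: float   # operating hours the prediction covers


def exceeds_threshold(pred: Prediction, threshold: float) -> bool:
    """Threshold rule from the text: replace when p_fail exceeds the threshold."""
    return pred.p_fail > threshold


def break_even_threshold(planned_cost: float, unplanned_cost: float) -> float:
    """Failure probability above which replacing now has the lower expected cost.

    Assumes waiting costs p_fail * unplanned_cost in expectation and that
    replacing now costs planned_cost with certainty (a deliberate simplification).
    """
    return planned_cost / unplanned_cost


# The example prediction from the text: 40 percent within 95 operating hours.
pred = Prediction("Component X", p_fail=0.40, horizon_hours=95)

# Hypothetical costs: a planned replacement versus an unplanned failure.
threshold = break_even_threshold(planned_cost=20_000, unplanned_cost=80_000)

print(threshold)                           # 0.25
print(exceeds_threshold(pred, threshold))  # True
```

Under these assumed costs, any component with more than a 25 percent chance of failing in the window is cheaper to replace now than to run to failure.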

Recent Successes

PM is most successful when users have continuous access to real-time data plus the ability to execute maintenance evolutions rapidly and on short notice. In the aviation community, inspections are conducted frequently (preflight, postflight, daily, turnaround, etc.), and most maintenance tasks can be accomplished in a few days. Furthermore, downing a single plane has a limited impact on overall mission readiness. Thus, a recent Defense Innovation Unit (DIU) prototype implementation conducted with the Air Force E-3 Sentry, C-5 Galaxy, and F-16 Fighting Falcon found that PM offered “the potential for a 3-6 percent improvement in mission capability; up to a 35 percent reduction of base-level occurrences of aircraft sitting on the ground awaiting parts; and up to a 40 percent reduction in unscheduled maintenance events.”⁴ DIU estimated that implementing PM across DoD aviation could save $5 billion per annum.

Ship maintenance, on the other hand, must be carefully planned around operational considerations; hundreds of maintenance items are packaged into months-long “availabilities.” Knowing what to inspect and when to inspect it is a challenge, and conveying the relevant information from a deployed ship to the cognizant in-service engineering agent (ISEA) introduces unique security requirements.⁵

Nevertheless, there have been some encouraging developments. Recently, the Navy announced a program to use Google Cloud AutoML to predict corrosion on surface ships. Drones will take high-resolution photographs of ship exteriors, which will then be labeled by corrosion experts and used to train predictive models.⁶ From the modeling perspective, this is low-hanging fruit. Computer vision is currently the best understood and most developed branch of AI; a good image classifier can be developed in less than a day by a competent undergraduate. The greatest challenge in this project will be collecting a large enough “training set” of images that have been labeled by human experts. Importantly, because exterior photos of naval vessels are not classified, there are no security concerns surrounding data transmission and storage in this application.

Another promising direction for PM is shipboard hull, mechanical, and electrical (HM&E) systems. Propulsion and auxiliary systems are minutely monitored by, among others, Machinery Condition Analysis and Machinery Vibration Analysis programs. Electronic and hand-kept logbooks represent decades of historical data that can, at least theoretically, be used to train PM algorithms. In a 2019 speech, then–Naval Sea Systems Command Deputy Commander for Ship Design, Integration, and Naval Engineering Rear Admiral Lorin Selby outlined a plan to introduce remote sensing into HM&E and incorporate this data into O-level maintenance:

I have ships with a number of sensors on them, measuring things like reduction gears, shafting components, turbines, generators, water jets, air conditioning plants, high packs, a number of components, and we’re actually pulling data off those ships, in data acquisition systems. . . . The systems that will be monitoring, say the turbine; it will tell the operators when a work procedure has to be performed, and it will also then tap into the work package side of the house and generate a work package that gets sent to the ship, to the work center, to do the work. And if there’s a part involved, it will be able to pull a part from the supply system.⁷

This sensor suite, known as the Enterprise Remote Monitoring system, is currently undergoing field trials.⁸ Naval propulsion and mechanical systems are good candidates for PM because they can leverage existing software developed for the civilian power and manufacturing industries.⁹ When time is not a constraint, PM is simply a refinement of the current CBM pipeline: Instead of replacing a part when or just before it fails, use machine learning to walk back the decision point by a few weeks or months. Detecting anomalies and predicting failures do not require any cost-benefit analysis; that task is left to the user.
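As a rough illustration of anomaly detection on machinery data, the sketch below flags readings that fall far outside a rolling baseline. The rolling z-score technique, the window size, the limit of three standard deviations, and the vibration values are assumptions chosen for the example; they do not describe how Enterprise Remote Monitoring or any commercial package works.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, z_limit=3.0):
    """Return indices of readings that deviate from the mean of the preceding
    `window` readings by more than `z_limit` sample standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        # A perfectly flat baseline has sigma == 0; skip it rather than divide.
        if sigma > 0 and abs(readings[i] - mu) > z_limit * sigma:
            flagged.append(i)
    return flagged


# Hypothetical vibration amplitudes from one bearing, one reading per watch.
vibration = [2.0, 2.1, 1.9, 2.0, 2.1, 2.0, 3.5, 2.0]

print(flag_anomalies(vibration))  # [6]
```

Note that the function only reports which readings look unusual. Whether a flagged reading justifies opening the equipment is exactly the cost-benefit judgment that remains with the user.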

Scaling up PM to a form that comprehends an entire ship, with hundreds of systems classified at Secret or Top Secret, where it is unclear even what sort of data needs to be collected, presents unique challenges. This is especially so if the Navy is to balance cost savings with operational readiness. For big-ticket items that can be accomplished only during a drydocking availability, algorithms must not only predict that a component is about to fail but also detail the consequences of failure in the context of the overall health of the ship. This is the core of the digital twin concept, which is the notion that all the telemetry, inspection reports, and maintenance history for each vessel should be organized into a single unit, the “digital twin” of that vessel.¹⁰ Naval Sea Systems Command is committed to bringing digital twins into the fleet; the new Constellation-class frigate design process includes a fully integrated digital modeling capability.¹¹ PM requires not only digital twins, but also up-to-date knowledge of repair costs, deployment schedules, and operational commitments. This will enable the Navy to predict not merely failures but also their consequences, which is the only way that PM will truly work.

A Holistic Approach

The Obsolescence Management Information System tracks nearly half a million parts used by Naval Sea Systems Command, Naval Air Systems Command, the Marine Corps, and the Army. The system uses machine learning to predict when parts will become obsolete. Recently, the team has begun exploring ways to use machine learning also to enhance users’ decision-making, for example, by recommending a lifetime buy, a drop-in replacement part, or a new design solicitation for a critical system. Knowing that a part is going to become obsolete is not as important as knowing what the consequences will be and what can be done about it. To be effective on a wide scale, predictive maintenance must go the same route.

Rather than “X will fail with probability Y after Z hours,” algorithms should be able to incorporate a variety of anticipated use conditions and make recommendations based on cost to repair, cost to replace, overall ship status, and scheduling considerations. A surface ship deploying to Central Command in July will face very different environmental conditions than one deploying to European Command in December; this has major implications for corrosion control, air conditioning, and electronic systems. The failure probability should incorporate both the present condition and the anticipated future conditions of a particular vessel.
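One simple way to fold anticipated conditions into a failure probability is a proportional-hazards adjustment, sketched below. The constant baseline failure rate, the single temperature covariate, and its coefficient are hypothetical values picked for illustration, not fitted to any real equipment.

```python
from math import exp


def failure_probability(base_rate_per_hour, hours, covariates, coefficients):
    """P(failure within `hours`) under a constant hazard scaled by conditions.

    hazard = base_rate * exp(sum(coefficient * covariate))
    P(fail) = 1 - exp(-hazard * hours)
    """
    log_multiplier = sum(coefficients[name] * value for name, value in covariates.items())
    hazard = base_rate_per_hour * exp(log_multiplier)
    return 1.0 - exp(-hazard * hours)


# Hypothetical air conditioning plant: baseline of one failure per 10,000 hours,
# with the hazard rising about 3 percent per degree Celsius above 20 C ambient.
coefficients = {"ambient_temp_c_above_20": 0.03}
base_rate = 1e-4
deployment_hours = 2000

centcom_july = failure_probability(
    base_rate, deployment_hours, {"ambient_temp_c_above_20": 20}, coefficients
)
eucom_december = failure_probability(
    base_rate, deployment_hours, {"ambient_temp_c_above_20": -10}, coefficients
)

print(f"Central Command, July:      {centcom_july:.2f}")
print(f"European Command, December: {eucom_december:.2f}")
```

The same component in the same present condition gets a different failure probability depending on where and when the ship deploys, which is the behavior the paragraph above calls for.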

In addition, the risk of failure in a major system may become acceptable when balanced against the combined risk of multiple failures in minor systems; a PM algorithm should present a decision-maker with a variety of worst-case scenarios and their associated risks and mitigation. Automated survival analysis can be used to generate input to availability work packages that already balance cost, reliability, and deployment schedules.¹²
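For readers unfamiliar with survival analysis, the Kaplan-Meier estimator is its most basic tool: it estimates the fraction of components still working after a given number of hours, even when some components were removed before they failed. The sketch below is a plain-Python version run on made-up pump data; the durations and the function name are illustrative only.

```python
def kaplan_meier(durations, failed):
    """Return [(time, survival_probability)] at each distinct failure time.

    durations: hours each component was observed
    failed:    True if it failed at that time, False if it was removed
               or was still running when observation ended (censored)
    """
    records = sorted(zip(durations, failed))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        failures = 0
        removed = 0
        # Group all components whose observation ended at time t.
        while i < len(records) and records[i][0] == t:
            failures += int(records[i][1])
            removed += 1
            i += 1
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical operating hours for six pumps of the same type.
hours = [100, 150, 150, 200, 250, 300]
failed = [True, True, False, True, False, True]

for t, s in kaplan_meier(hours, failed):
    print(f"{t:>4} h: {s:.3f}")
```

A curve like this, built per component type from fleet history, is the kind of input that could be combined with cost and schedule data when assembling a work package.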

Tools for manually conducting such level-of-repair analyses have long been used by DoD and civilian industries.¹³ The Navy Common Readiness Model, a planned component of the Navy’s fleet-wide logistics IT systems upgrade, aims to integrate these tools with condition monitoring and reporting systems.¹⁴ Extending these capabilities to fully automated PM is a challenge but not a pipe dream.

The first step is to follow the roadmap laid down by big tech: Hoard data. Amazon monetizes every aspect of user data, from browsing history and click-through rates to cursor movements and user location. Long before its data miners figured out how and what to mine, management made a conscious decision to collect as much data as possible in the expectation that someday some of it would prove useful.¹⁵ Data storage is cheap, sensors are cheap, and ongoing research promises to deliver fast and secure data links from naval assets in the field. Fleet-wide telemetry can be collected and maintained by the type commands; Submarine Maintenance Engineering, Planning and Procurement; and the Surface Maintenance Engineering Planning Program. These entities can guide the development and improvement of PM algorithms offline. Stable software releases can then help ISEAs provide near-real-time analysis to the ship using secure data links. Ship’s force will then conduct O-level maintenance and work with shipyards and other stakeholders to plan and schedule availabilities.

Further in the future, reinforcement learning algorithms can be used to identify “where to look”—which additional components need to be monitored to increase predictive accuracy. Eventually, this information will flow both up and down the maintenance pipeline, perhaps refining acceptance and testing procedures for new deliveries or determining which maneuvers and checks should be executed during the next sea trial.

Before any of this can happen, the data must be there. And it must be clean, organized, and accessible in centralized databases. While we wait for Enterprise Remote Monitoring and similar systems to reach the fleet and begin hoovering up sensor readings, the time is ripe to lay the organizational groundwork for a rapid deployment. The Navy needs to embrace the DevOps approach to PM software development.¹⁶ The proper flow of information and authority must be established to effectively integrate PM into existing maintenance programs. Machine learning solutions can then be gradually introduced into the pipeline, prioritizing the highest payoff areas first and iteratively refining data collection and automated prediction procedures.

1. Megan Eckstein, “Maintenance Planning Summit Recommends Time-Based Maintenance, ‘Tighter Learning Circle,’” USNI News, 22 February 2017.

2. Government Accountability Office, NAVY SHIPYARDS: Actions Needed to Address the Main Factors Causing Maintenance Delays for Aircraft Carriers and Submarines (August 2020).

3. See, for example, the following solicitation: “Predictive Condition-Based Maintenance for High-Powered Phased Array Radar Systems.”

4. Defense Innovation Unit Annual Report, 2019.

5. Naval Surface Warfare Center’s Port Hueneme Division was recently awarded a patent for its Secure Shipboard Information Management System, developed precisely to serve this function.

6. Google, “Google Cloud and STS to Automate U.S. Navy Maintenance Inspections Using AI and ML Technology,” press release, 27 August 2020.

7. Ben Werner, “Navy Refining How Data Analytics Could Predict Ship Maintenance Needs,” USNI News, 24 June 2019.

8. Megan Eckstein, “Navy Embracing Quicker Software Development Model to Leverage New HM&E Data Collection,” USNI News, 12 August 2019.

9. See also the Joint Fleet Maintenance Manual 2.4.4.2: “The advent of real-time machinery digital sensors, analysis tools, data recording and data transfer has brought Automated Machinery Condition Analysis (AMCA) to the forefront of CBM. AMCA systems are being employed and installed on new ships-of-the-line and back-fitted where practicable on existing ships. The AMCA tools and systems support the MCA programs and MCA surveys. The systems are implementing prognostic, diagnostic and maintenance capabilities for both shipboard and off-ship personnel to utilize to enhance understanding of the mechanical condition of propulsion plant and auxiliary rotating machinery.” See, for example, Cornelius Scheffer and Paresh Girdhar, Practical Machinery Vibration Analysis and Predictive Maintenance (Oxford and Boston: Newnes, 2004).

10. See, for example, the recent solicitation: “Efficient Data Management to Improve Navy Maintenance and Ship Operational Readiness.”

11. Naval News Staff, “NAVSEA Builds New Frigate Readiness Digital Model in Advance of Construction,” Naval News, 19 April 2021.

12. For an overview of survival analysis in engineering, see John D. Kalbfleisch and Ross L. Prentice, The Statistical Analysis of Failure Time Data, 2nd ed. (Hoboken, NJ: Wiley InterScience, 2002).

13. Current guidelines are set down in the AeroSpace and Defence Industries Association’s S4000P International specification for developing and continuously improving preventive maintenance.

14. As quoted by Brett Vaughn, Navy Chief AI Officer and Office of Naval Research AI portfolio manager.

15. The full program, known as Model Based Product Support, is expected to begin initial operational integration in the second half of 2021. For an overview, see www.nsrp.org/wp-content/uploads/2020/06/Model-Based-Product-Support-MBPS.pdf.

16. For an overview of the DevOps philosophy, see https://aws.amazon.com/devops/what-is-devops/.

By Lieutenant (j.g.) Ben Rosenzweig, U.S. Navy Reserve
