If the Navy fails to understand and actively manage data bias, it could develop incorrect planning assumptions, and future leaders could find themselves without the resources they need
Shutterstock

Seeing Through the Fog of Data Bias

By Eileen Chollet
December 2020
Proceedings
Vol. 146/12/1,414
Nobody Asked Me, But . . .

According to one database, over the past ten years, cruiser deployments have averaged eight months. The shortest deployment was two weeks, and the longest was more than two years. Every entry in that database is complete, correct, and authoritative. And yet, any budgeteer or analyst who used those numbers would end up wildly off target, because those statistics lump together forward-deployed and rotational cruisers, which use markedly different definitions of “deployment.”

If a distorted average becomes an incorrect planning assumption, future leaders could find themselves without the resources they need. 

Truth in data depends on definition, interpretation, and approximation. When data approximations differ from the “truth” according to some specified definition, statisticians call that “data bias.” If the future digital Navy fails to understand and actively manage data bias, it will find itself off course. 
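The cruiser example can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of any real database: the deployment lengths, the category labels, and the function names are all hypothetical, chosen only so that the pooled average comes out near eight months while each group, taken under its own definition of “deployment,” tells a different story.

```python
from statistics import mean

# Hypothetical deployment lengths in months. Illustrative values only,
# not real Navy data. Each record is (category, length_in_months).
deployments = [
    ("rotational", 6.0),
    ("rotational", 7.0),
    ("rotational", 8.0),
    ("forward_deployed", 0.5),
    ("forward_deployed", 1.5),
    ("forward_deployed", 25.0),
]


def pooled_mean(records):
    """Average every record together, ignoring how 'deployment' was defined."""
    return mean(months for _, months in records)


def mean_by_category(records):
    """Average each category separately, so each keeps its own definition."""
    groups = {}
    for category, months in records:
        groups.setdefault(category, []).append(months)
    return {category: mean(values) for category, values in groups.items()}


pooled = pooled_mean(deployments)
by_category = mean_by_category(deployments)

# Data bias here is the gap between the pooled figure and the figure
# that holds under each category's own definition.
bias = {category: pooled - value for category, value in by_category.items()}

print(f"pooled average: {pooled:.1f} months")
for category, value in by_category.items():
    print(f"{category}: {value:.1f} months (bias {bias[category]:+.1f})")
```

With these made-up numbers, every record is individually correct, yet the pooled average matches neither group. A planner who budgets rotational deployments from the pooled figure would be off by a month per deployment.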

Over the past few years, Navy leaders have rightly focused on the competitive advantages that data offer. In an April 2019 article in Defense One, Admiral William Moran warned of adversary attempts to “dominate the data domain” and urged the Navy to focus on “high-quality data input at any entry point.” The Secretary of the Navy’s 2019 Cybersecurity Readiness Review lays out the risks the Navy faces when it can no longer trust the confidentiality, integrity, and availability of data. Meanwhile, former Chief of Naval Operations Admiral John Richardson’s A Design for Maintaining Maritime Superiority 2.0 emphasized the foundational role high-quality data plays in decision-making. But even if the data foundation appears perfect, with every entry complete and correct, bias can imperceptibly erode the competitive edge. 

Some data bias always will occur when operating in real time and in the real world, rather than under laboratory conditions. Location bias, the overrepresentation of information physically closer to the data collector, is a concern when forces are distributed. “Big data” is great when you can acquire it, but time and funding constraints almost always limit sample sizes, leading to predictions from too few points. And it is human nature to empathize with the planners and participants of an experiment, exercise, or wargame, leading data collectors to leave out points they feel should not “count,” so that the event succeeds.

The bias problem only grows when there is no right answer. “How many ships does the Navy have?” may be quantitative, but it is also squishy. Is a submarine a ship? How about the USS Constitution? Include the National Defense Reserve Fleet? The definition of “ship” will bias the data one way or the other, producing a smaller or larger number depending on the choices the data analyst makes—consciously or unconsciously. In 2014, the Navy began counting hospital ships, some patrol craft, and cruisers in reduced status as part of the battle force. Congress, seeing politics where perhaps there was none, forced a return to the old counting rules less than a year later. Any model of the battle force data that does not take these definition changes into account could see trends where there are none. 
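The effect of a counting-rule change can be shown with a short Python sketch. Everything in it is an assumption made for illustration: the inventory numbers, the hull categories, the two rule sets, and the period labels are hypothetical and do not reproduce the actual battle force counting rules. The fleet itself is held constant, so any movement in the reported total comes from the definition alone.

```python
# Hypothetical inventory: hull type -> number of hulls. Not real fleet data.
INVENTORY = {
    "destroyer": 60,
    "submarine": 70,
    "hospital_ship": 2,
    "patrol_craft": 10,
}

# Two hypothetical definitions of "ship" for counting purposes.
OLD_RULE = {"destroyer", "submarine"}
NEW_RULE = OLD_RULE | {"hospital_ship", "patrol_craft"}


def count_ships(inventory, counted_types):
    """Total the hulls whose type is included under the given rule."""
    return sum(n for hull, n in inventory.items() if hull in counted_types)


# The rule in force during each reporting period (labels are illustrative).
rule_by_period = {"before": OLD_RULE, "during": NEW_RULE, "after": OLD_RULE}

# What a naive time series would record: each period under its own rule.
reported = {
    period: count_ships(INVENTORY, rule)
    for period, rule in rule_by_period.items()
}

# The same periods recounted under one consistent rule.
consistent = {
    period: count_ships(INVENTORY, OLD_RULE) for period in rule_by_period
}

print("reported:  ", reported)
print("consistent:", consistent)
```

The reported series rises and then falls even though no hull was added or retired. Recounting every period under a single rule removes the apparent trend, which is the adjustment a model of battle force data would need before looking for real changes.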

Managing data bias does not require the most sophisticated algorithms, machine learning, or supercomputers; to keep data bias from undermining the digital Navy, we need to cultivate people. The Navy’s data stewards—the people responsible for the collection, management, and administration of data sets—need specialized training to understand how data bias occurs, and to be instinctively skeptical of analytic results presented with only a hand wave at methodology. Data stewards need operational experience and longevity with a particular data set to understand the choices made when the data were collected and curated. And data stewards need to maintain their objectivity and independence, so their funding or fitness reports cannot depend on the data telling only the stories that leaders want to hear. The digital Navy that the nation needs depends on it. 

Eileen Chollet

Dr. Chollet is a research scientist in the Fleet Operations and Assessment Program at CNA. She specializes in information management, information and operational security, and Navy force employment and force structure assessment.

Copyright © 2022 U.S. Naval Institute