One aspect of Dr. W. Edwards Deming’s philosophy has become a rallying point for those who believe total quality leadership will never be fully implemented within the naval service: the elimination of performance evaluations and merit ratings. Dr. Deming insists that evaluation systems that rate one individual against another destroy teamwork, promote one-upmanship, and leave many people discouraged. Yet performance evaluations are a mainstay of the military. The Navy must revisit this system if it is to proceed as a quality organization.
In a speech during the 1985 Deming Prize Awards, Dr. Deming remarked:
The ratings of people in a group by any numerical system whatsoever . . . will divide the group into three groups: a) people outside the control limits on the bad side; b) people outside limits on the good side; c) the people between the limits.
Group A [and] Group B require individual attention. Now, people in between the control limits must not be ranked. . . . Differences between the limits come from the system itself, not from the people there. Everyone in Group C should receive the same increase in pay or the same bonus. There is no rightful distinction between them.1
To use a more nautical example, consider a warship standing smartly into port. Along the main deck, seamen with heaving lines in hand await the word to put over all lines. The ship approaches the pier and, when the word is given, half a dozen heaving lines unfurl, dragged through the air toward the pier by the weight of their monkey fists. A particularly lengthy toss makes its way to the line handlers on the pier. Another heaving line inevitably gets tangled upon itself and jerks immediately to a halt far short of its intended destination. Has one line thrower truly outperformed the other? Should the supervisor make future assignments based on this performance?
A sailor may make a long throw one time and a shorter one the next. Many random events—including the direction of the wind, position of the sun, weight of the monkey fist, type of line used, and the training of the thrower—affect the length of a particular toss.

The Navy should concentrate its evaluation efforts on those sailors who consistently perform outside acceptable limits—either above or below.
Should the sailors be ranked based on their performance? Deming says no. One would be evaluating the system, not the sailor. Each sailor did the best he could with what he had been given. There always will be variation between individuals and their performances. It is important for leaders to be aware of these differences and the nature of variation.
Statisticians have developed a measure—the standard deviation—to determine the amount of variation within a group of observations. For example, the more variation between the distances of the throws, the larger the standard deviation. If you drew all the line-toss distances on a frequency distribution, it most likely would resemble a bell-shaped curve. In a normal distribution, the peak of the curve would be located above the average, or mean, distance, with the longer- and shorter-than-average throws tapering off to the right or left of the peak, respectively. Most data, like the observed line tosses, will fall within three standard deviations of the average of all tosses.
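The arithmetic is straightforward. As a rough sketch (the toss distances below are invented for illustration, since the article reports none), the mean and standard deviation of a set of observations define the three-standard-deviation spread:

```python
import statistics

# Hypothetical heaving-line toss distances in feet (invented for illustration).
tosses = [72, 75, 68, 80, 74, 71, 77, 73, 69, 76]

mean = statistics.mean(tosses)
# Population standard deviation of the observed tosses.
std_dev = statistics.pstdev(tosses)

# Nearly all tosses from a stable process fall within three standard
# deviations of the mean.
upper = mean + 3 * std_dev
lower = mean - 3 * std_dev

print(f"mean = {mean:.1f} ft, std dev = {std_dev:.2f} ft")
print(f"three-sigma spread: {lower:.1f} to {upper:.1f} ft")
```

With these invented figures the mean is 73.5 feet and the standard deviation 3.5 feet, so the spread runs from 63.0 to 84.0 feet.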
Statisticians call the point three standard deviations above the mean the upper control limit. Similarly, the point three standard deviations below the mean is called the lower control limit. The supervisor should not rate those individuals between the upper and lower control limits. The observed differences in their tosses are largely the result of the unique combinations of random influences at the moment of the toss—or common causes. Common causes are the result of the system, not the individual. Special causes of variation result in individuals falling outside the control limits or demonstrating a definite pattern of tosses within the control limits. The goal of the leader should be to identify special causes of variation through quantitative and objective methods.
Perhaps one individual is continually above the upper control limit. This person deserves the supervisor’s individual attention. He may possess a new method or skill that should be investigated. Something may be learned from that sailor that can be passed on to the others, effectively reducing the total variation of future heaving line tosses and raising the average distance of all tosses. Likewise, there may be a sailor who consistently performs below the lower control limit. This shipmate needs the supervisor’s help— better training, more time to practice, proper heaving lines, or perhaps a different task more suited to his talents.
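A supervisor applying this idea would flag only the sailors whose results fall outside the limits. The sketch below assumes a process mean of 73.5 feet and standard deviation of 3.5 feet; those figures, the sailors, and their average distances are all invented for illustration:

```python
# Control limits for the heaving-line process, three standard deviations
# from the mean (values assumed for illustration).
mean, std_dev = 73.5, 3.5
ucl = mean + 3 * std_dev   # upper control limit: 84.0 ft
lcl = mean - 3 * std_dev   # lower control limit: 63.0 ft

# Hypothetical average toss distance per sailor (invented for illustration).
averages = {"Sailor A": 74.0, "Sailor B": 91.0, "Sailor C": 71.5, "Sailor D": 55.0}

for name, distance in averages.items():
    if distance > ucl:
        # Special cause on the good side: study and spread his method.
        print(f"{name}: above the upper control limit")
    elif distance < lcl:
        # Special cause on the bad side: training, practice, or a new task.
        print(f"{name}: below the lower control limit")
    else:
        # Common-cause variation: a product of the system, not the sailor.
        print(f"{name}: between the limits; do not rank against peers")
```

Only Sailors B and D would receive the supervisor's individual attention; the differences among the rest come from the system itself.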
Countless human resources are wasted in the Navy in comparing and evaluating sailors who, as a whole, are doing the best they can to operate within the system in which they work. These are sailors who fall between the upper and lower control limits.
Remember, the goal of the leader is to “shrink the control limits, to get less and less variation in a process, or less and less difference between people.” This means striving to raise the mean level of performance for all sailors and discovering, through objective, quantitative methods, which people need special attention.
It is both destructive and difficult to evaluate and compare the performances of people who fall between the upper and lower control limits. Those of us who implement the Navy’s evaluation process know the frustration of trying to break out one individual from another. Most of us can easily break out an outstanding performer—the one sailor who falls outside the upper control limit—but it is quite logical that most of our sailors perform around average. Yet we supervisors are so concerned that our statistically “average” shipmates will be compared with another group of shipmates, whose supervisor believes it is possible for 100% of his division to be above the upper control limit, that we write fifty 4.0 petty officer third class evaluations, trying to separate the “upper uppers” from the “upper lowers” with a few more or fewer flowery descriptions. This is among the most useless, wasteful, and frustrating policies the Navy has today.
The Navy must change its long-standing beliefs about performance evaluations and make the transition to quality. It probably will not do away with the current evaluation system, but a more reasonable approach could be adopted: require a written report of performance only on those sailors who fall outside—above or below—objectively determined control limits, at the time the event occurs rather than once a year. Commanders would not be allowed to write evaluations for more than 5% of their men. The remaining sailors would not receive performance reviews of any kind. Reports to document misconduct for administrative or punitive procedures would be treated as a separate issue.
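As a back-of-the-envelope illustration of the proposed 5% cap (the command size and flagged names below are invented):

```python
# Hypothetical command of 200 sailors; suppose a control-limit analysis has
# flagged these few as performing outside the limits (names invented).
command_size = 200
flagged = ["PO3 Smith", "SN Jones", "PO2 Brown"]

# Under the proposed rule, at most 5% of the command may receive
# written evaluations; everyone else receives none.
cap = int(0.05 * command_size)
if len(flagged) > cap:
    raise ValueError(f"only {cap} written evaluations allowed, got {len(flagged)}")

print(f"{len(flagged)} of {command_size} sailors receive written reports "
      f"({len(flagged) / command_size:.1%}); the rest receive none")
```

Here three sailors, or 1.5% of the command, would generate written reports, well under the 5% ceiling.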
What about all those reasons we use to justify performance reviews? In his book, Dr. Deming: The American Who Taught the Japanese About Quality, Rafael Aguayo states the following commonly used reasons for conducting an annual evaluation:
1. It provides an opportunity for a supervisor and an employee to meet and discuss what’s going on, to give each other feedback.
2. It provides a record of the employee’s performance.
3. It provides an external incentive for employees to do their best. Fear of a bad review and the hope of a good review are supposed to provide incentives for individuals to perform better.
4. When everyone tries to excel by being recognized as a top performer, it improves everyone’s individual performance, and therefore the group’s performance.
5. It helps management recognize the better performers and thus provides a basis for rewarding and promoting those who are innately more talented or have worked harder.2

Let’s examine these justifications with a more military slant. If a Navy leader is waiting for the annual review to give and solicit feedback, he is remiss in his responsibilities. Feedback should be given and sought immediately. The opportunity to praise or provide the means for positive change is always at hand. Discussing an individual’s performance is different from rating him in relation to his peers. Reason 1 justifies greater communication between seniors and subordinates, not the annual evaluation.
The documentation of training, duties, major qualifications, and other noteworthy accomplishments—Reason 2—is one area of the current performance review process that will continue to be important. Maintaining records of such events is essential to military manpower resource planning. However, the need for this documentation is not a justification for ranking sailors. A sailor’s performance is affected by the system in which he works and should not be recorded for comparison with others. The command could use a diary-type report or message to document major personnel landmarks. These qualifications and skills, not relative rankings, should be the primary basis for any selection process.
Reason 3 implies that fear is an appropriate and effective incentive program. Fear of a poor review or hope for a good one may create a short-term improvement in performance by those due for evaluation, but this temporary shift in performance increases the variability of the work force and creates needless competition between sailors. Any human being, when afraid, will do whatever is necessary to remove the threat. If the captain demanded an improvement in morale before liberty call would be passed, there would be a marked increase in the number of smiles and positive comments to those in a position to influence the call to liberty, but morale would remain unchanged or even diminish.
Reason 4 suggests that competition among sailors improves everyone’s performance. Many studies, however, have confirmed the superior achievements of groups who work in an environment of cooperation.3 Competition on the sports field can be fun, but we in the service are supposed to be playing on the same team. Even a sports team would be in trouble if each player had his own agenda for the
conduct of the game. Teamwork and cooperation are key to a successful mission. Dr. Deming points out, however, that “teamwork in a company, except for putting out fires, is impossible under the existing annual appraisal of performance. Everybody, once the fire is conquered, goes back to his own life preserver, not to miss a raise in pay.”4
Finally, Reason 5 states that performance reviews help manpower specialists identify the better performers, allowing them to determine, among other things, who should be rewarded with promotions. This would be true only if the evaluations accurately portrayed the performance of the individual and not just the performance of the system in which the sailor worked. It is possible that a second class petty officer ranked behind three of his peers on one ship may be more effective and efficient, by any objective measure, than a sailor of the same rate who is ranked ahead of all others at a different command. In this instance, it is fate—not performance—that has created the distinction between the two sailors’ rankings. If we created an accurate evaluation system and submitted reports only on those sailors who fell outside objective control limits, promotion and other military boards could review 95% fewer performance reports, increasing their effectiveness and efficiency.
The Navy’s strategic goals for TQL state that the Department of the Navy will “. . . enhance the working environment to improve the performance of quality military and civilian personnel.” To meet this goal, the Navy must examine the way in which it defines and documents quality military personnel.
Sailors hear with their eyes, and TQL will become Total Quality Lip Service if the Navy is perceived as implementing only those parts of TQL with which it feels comfortable. Revisiting the current annual evaluation system is essential and will provide evidence of the Navy’s commitment to process improvement.
1. “Foundation for Success of Japanese Industry,” speech to the 23rd Annual Conference for Top Management, 11 November 1985, Tokyo.
2. Rafael Aguayo, Dr. Deming: The American Who Taught the Japanese About Quality (A Fireside Book, Simon and Schuster, 1991), pp. 190-191.
3. Alfie Kohn, No Contest (Houghton Mifflin Company, 1986), p. 47.
4. Mary Walton, The Deming Management Method (New York: Dodd, Mead, 1986), foreword by W. Edwards Deming, p. xii.
Lieutenant Commander Sinkiewicz is reporting as operations officer on board the USS Horne (CG-30) in July. He holds a master’s degree in management from the Naval Postgraduate School.
Proceedings / July 1993