Reliability Review
Volume 20 Issue No. 2, June 2000
Reprinted from pp. 13-19.
Raising the Reliability Benchmark
by Larry George and Eva Langfeldt
"One can act only on the available information, and often one must." (Herman Rubin, math/sci.stat.consult newsgroup posting)
WHAT IS RELIABILITY?
Reliability is the probability of success at a specified age under specified conditions [O'Connor]. Success is defined by customers. Age is in calendar or operating time, cycles, or another measure. Relevant conditions are field conditions, not laboratory conditions or demonstration test conditions. Field reliability is what happens in real life.
Two functions describe reliability: the reliability function and the age-specific failure rate function. The failure rate function contains the same information as the reliability function, but it shows possible causes of problems better.
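The equivalence of the two functions can be sketched in a few lines of Python. This is a minimal illustration, assuming discrete ages and the usual actuarial definition of the failure rate (the fraction of units surviving to the start of an age interval that fail during it). The function names and the numbers are hypothetical, not taken from the article.

```python
def failure_rate_from_reliability(reliability):
    """Discrete failure rate a[t] = (R[t-1] - R[t]) / R[t-1], with R = 1 at age zero."""
    rates = []
    previous = 1.0  # every unit works when it ships
    for r in reliability:
        # The rate is undefined once nothing survives; report 0.0 in that case.
        rates.append((previous - r) / previous if previous > 0 else 0.0)
        previous = r
    return rates


def reliability_from_failure_rate(rates):
    """R[t] = product over s <= t of (1 - a[s])."""
    reliability = []
    surviving = 1.0
    for a in rates:
        surviving *= 1.0 - a
        reliability.append(surviving)
    return reliability


# Hypothetical reliability at the end of ages 1..4 (illustrative numbers only).
R = [0.90, 0.85, 0.80, 0.60]
a = failure_rate_from_reliability(R)
R_back = reliability_from_failure_rate(a)  # round trip recovers R
```

In this made-up example the reliability curve declines steadily, while the failure rate makes the jump in the last age interval obvious, which is the sense in which the failure rate function points to problems more directly.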
Companies need age-specific field reliability and failure rate information for all products and parts over their useful lives. The new benchmark that this article introduces satisfies the need cost-effectively, using available data.
WHAT IS THE OLD BENCHMARK?
The generally accepted (old) reliability benchmark is to track parts by serial number from production to failure and use ages at failure and survivors' ages to estimate age-specific field reliability for all products and parts by calendar production interval. An alternative (new) reliability benchmark is to use ships and returns data (required by generally accepted accounting principles) to estimate age-specific field reliability.
FEATURES OF THE OLD BENCHMARK
The old benchmark provides population age-at-failure data, from which you can compute nonparametric maximum likelihood reliability estimators that preserve all information in the data. They have little bias and minimum variance, assuming that all failure data has been recorded since first production.
You can use the empirical reliability function if you can wait until all products fail, or use the Kaplan-Meier estimator [Kaplan and Meier] for right-censored failures if you can't wait until all fail. Alternatives for left-censored failures (unobserved prior failures) aren't commonly used, although they are commonly needed. Have you ever heard, "The data has been archived"? That means all the old failure data is in an unreadable format on punch cards in a salt mine in Kansas.
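For readers who have not computed it by hand, the Kaplan-Meier (product-limit) estimator can be sketched as follows. This is a minimal illustration for right-censored data only; the function name `kaplan_meier` and the six-unit data set are hypothetical.

```python
from collections import Counter


def kaplan_meier(ages, failed):
    """Return [(age, reliability)] at each distinct failure age.

    ages   -- age at failure, or age at last observation, for each unit
    failed -- True if the unit failed at that age, False if right-censored
    """
    failures = Counter(age for age, f in zip(ages, failed) if f)
    exits = Counter(ages)  # failures plus censored units leaving at each age
    at_risk = len(ages)
    reliability = 1.0
    curve = []
    for age in sorted(exits):
        d = failures.get(age, 0)
        if d:
            # Multiply by the fraction of at-risk units that survive this age.
            reliability *= 1.0 - d / at_risk
            curve.append((age, reliability))
        at_risk -= exits[age]
    return curve


# Hypothetical data: six tracked units, three still working when observation stopped.
ages = [2, 3, 3, 5, 6, 6]
failed = [True, True, False, True, False, False]
km_curve = kaplan_meier(ages, failed)
```

Note that the input is one record per serial-numbered unit, which is exactly the tracking burden the old benchmark imposes.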
DISADVANTAGES OF THE OLD BENCHMARK
The old benchmark requires a large amount of data, and using it brings large numbers of errors, indifference, and other problems. Tracking vehicle warranty returns, for instance, requires a mainframe computer and an expensive database and annoys the people who have to record data. Those who try to get information say, "It's easy to put data in, but it sure is hard to get information out."
Some companies, including IBM, Xerox, and Applied Materials, used to track all products and parts by serial number but decided it was too costly. They and others resorted to inferior alternatives, including the following:
· Apply the old benchmark to a sample. Samples are criticized for not being representative of the population, and because "it's human nature to disbelieve statistically significant results based on a sample that is a small fraction of the population [Tversky and Kahneman]."
· Test a sample and hope the test reliability resembles the field reliability. Tests are criticized for not reflecting field conditions, and accelerated tests are criticized for introducing failure modes. Some companies have found that the field MTBF is down around the fifth percentile of the MTBF estimate based on test data.
· Graph annual failure rate, returns/1000, or another conveniently calculated measure. Argue pointlessly about wiggles in the graph.
· Predict MTBF and compare it with 1/AFR. Hope all is well and go to work on the next project before field reliability information begins to look bad.
The last two alternatives don't even provide age-specific reliability information, and predictions suffer from lack of credibility.
WHO USES THE OLD BENCHMARK?
Some examples: The Air Force Materiel Command uses actuarial methods. (Actuarial rates are discrete, age-specific failure rates.) The FAA requires airlines to track about 75 "fracture critical" parts per aircraft. Several car OEMs track all warranty returns by VIN and part number.
WHAT DOES THE NEW BENCHMARK INVOLVE?
· Use ships (sales, production, etc.) and product and part return data (complaints, failures, repairs, spares sales, etc.) instead of age-at-failure data to estimate field reliability for all products and service parts.
· Use bills-of-materials to convert product ships into parts installed base by age.
· Estimate field reliability and failure rate functions by failure mode, customer, and production batch.
· Quantify uncertainty in estimates and forecasts with confidence limits.
· Use the estimates when and where appropriate.
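The bill-of-materials step in the list above can be sketched as follows. This is a rough illustration that counts parts shipped inside products and ignores failures, retirements, and spares; the function name, product names, and quantities are all hypothetical.

```python
def parts_installed_base_by_age(product_ships, quantity_per_product):
    """Parts installed base by age, ignoring failures and retirements.

    product_ships        -- {product: [units shipped in period 0, 1, ...]}
    quantity_per_product -- {product: number of this part in each product} (the BOM)
    Entry k of the result is the number of parts that are k periods old
    at the end of the last period.
    """
    periods = max(len(ships) for ships in product_ships.values())
    part_ships = [0] * periods
    for product, ships in product_ships.items():
        qty = quantity_per_product.get(product, 0)
        for period, units in enumerate(ships):
            part_ships[period] += qty * units
    # Parts shipped in the last period have age 0, so reverse the order.
    return part_ships[::-1]


# Hypothetical ships for two products that both contain the same part.
product_ships = {"switch_A": [100, 200, 150], "switch_B": [0, 50, 80]}
bom = {"switch_A": 2, "switch_B": 4}
base_by_age = parts_installed_base_by_age(product_ships, bom)
```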
FEATURES OF THE NEW BENCHMARK
Generally accepted accounting principles require population ships and returns data [Plank], which are as accurate as accounting data. Ships and returns data require about a thousandth of the storage space that tracking products and parts by serial number requires; that's a PC versus a mainframe. The maximum likelihood estimator [George and Agrawal] is asymptotically unbiased and has minimum variance compared with alternative estimators from ships and returns data. Its information (inverse entropy) has the same order of magnitude as that of Kaplan-Meier estimates with the same censoring, so you don't have to track parts by serial number.
DISADVANTAGES OF THE NEW BENCHMARK
Some people say it can't be done [Krupp]: "You need to know how old each product or part was when it failed." They're wrong. Others say it's too hard: "Returns could have come from any production interval; you need to express the returns probability distribution in terms of the reliability from all production intervals." They're mostly right. It is hard.
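The objection about production intervals can be made concrete. In the simplest case (each unit fails at most once and is not replaced, an assumption made here only for illustration), the expected returns in a period are a sum over every earlier production interval of ships times the probability of failing at the corresponding age. The function name and numbers below are hypothetical.

```python
def expected_returns(ships, failure_prob):
    """Expected returns per period when each unit can fail only once.

    ships        -- units shipped in periods 0, 1, ...
    failure_prob -- failure_prob[k] = probability that a unit fails at age k periods
    """
    expected = []
    for t in range(len(ships)):
        total = 0.0
        for s in range(t + 1):
            age = t - s  # age in period t of units shipped in period s
            if age < len(failure_prob):
                total += ships[s] * failure_prob[age]
        expected.append(total)
    return expected


# Hypothetical ships and age-at-failure probabilities.
ships = [100, 200, 300]
failure_prob = [0.10, 0.05, 0.02]
E_returns = expected_returns(ships, failure_prob)
```

Estimation runs this relationship backward: the ships and the returns are known, and the age-at-failure probabilities are the unknowns.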
Do reliability specialists really want to know nonparametric, age-specific field reliability of all products and parts, from all vendors, in all failure modes, for all customers, over all production intervals? They resist change because they have a personal stake in the way things are done, in addition to major investments in obsolescent methods, mainframe computers, huge databases, and procedures for tracking parts.
WHO HAS USED THE NEW BENCHMARK?
Communications · phones, microwave devices, satellites, network switches
Electronics · power supplies, computer parts and peripherals, engine controllers, ultrasonic cleaners, switches, semiconductor capital equipment
Transportation · autos, trucks, and aircraft
Power · switches, digital relays, gas turbines
Medicine · clinical laboratory equipment, hemoanalyzers, stents, imaging, lasers, glucometers, respirometers, CO2 sensors
Biostatistics · HIV+-to-AIDS time and AIDS survival time, without patient identification
A network switch company uses old products' field reliability to predict new products' field reliability, forecast service requirements, and recommend the contents of spares kits. Old product failure rate functions are estimated from ships and returns data (new benchmark). New product failure rate functions are predicted to be those of old products multiplied by the ratio of MTBF predictions, MTBF(new)/MTBF(old).
An automotive aftermarket information supplier sells actuarial forecasts that use the age-specific failure rate estimated from sales, parts catalogs, and vehicle registrations (new benchmark) and the installed base. They have less bias and more precision than time series extrapolations.
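An actuarial forecast of this kind multiplies the installed base at each age by the failure rate at that age and sums over ages. The sketch below is an illustration only: it assumes no new sales during the forecast horizon, drops units that age past the last estimated rate, and uses made-up numbers. The supplier's actual procedure is not described in this article.

```python
def actuarial_forecast(installed_base_by_age, failure_rate_by_age, periods):
    """Forecast expected failures for the next `periods` periods.

    installed_base_by_age -- surviving units at ages 0, 1, 2, ...
    failure_rate_by_age   -- failure rate of a surviving unit at each age
    """
    base = list(installed_base_by_age)
    forecasts = []
    for _ in range(periods):
        failures = [n * a for n, a in zip(base, failure_rate_by_age)]
        forecasts.append(sum(failures))
        survivors = [n - f for n, f in zip(base, failures)]
        # Survivors age one period; no new sales are assumed, so age 0 is empty,
        # and units past the oldest estimated age leave the forecast.
        base = [0.0] + survivors[:-1]
    return forecasts


# Hypothetical installed base and failure rates for ages 0, 1, 2.
installed_base = [500, 400, 300]
failure_rates = [0.02, 0.01, 0.03]
forecast = actuarial_forecast(installed_base, failure_rates, periods=2)
```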
Figure 1 shows hindcasts (forecasts of past returns, labeled E[Returns]) and forecasts with an upper confidence limit. The x-axis lists months since beginning of production. The product had been in the field 14 months, so returns were observed for 14 months and forecasts begin at age 15.

Figure 1. Hindcasts and forecasts with confidence intervals
One goal of an aircraft company was a 5% engine failure rate per year during the two-year warranty period. Figure 2 shows the quarterly failure rates. The first-year failure rate was 5.8%; the second-year failure rate was 16.6%.

Figure 2. Quarterly failure rates
A medical laser supplier used field reliability to improve its products. The age-specific failure rate showed infant mortality due to manufacturing defects and vendor differences. It published its products' field reliabilities, an unprecedented disclosure that led to an increase in market share.

Figure 3. Laser reliability
NEW BENCHMARK METHODOLOGY
To use the new benchmark, input ships and returns (or equivalent); use bills-of-materials to convert the products' installed base to the parts' installed base, if necessary. Missing data is OK. Table 1 shows typical data from a car manufacturer: monthly vehicle ships and drivetrain warranty returns.
Table 1. Typical ships and warranty returns input
| Month | Ships | Returns |
| --- | --- | --- |
| Aug-87 | 213 | 18 |
| Sep-87 | 6439 | 797 |
| Oct-87 | 6951 | 1291 |
| Nov-87 | 5715 | 1511 |
| Dec-87 | 5390 | 1791 |
| Jan-88 | 6336 | 2282 |
| Feb-88 | 6319 | 2628 |
| etc. | etc. | etc. |
Compute the estimates for dead forever, good-as-old, renewal, mixtures, missing data, and other models.
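To give a feel for how age-specific estimates can come out of ships and returns alone, here is a deliberately simple sketch for the dead-forever case. It is not the maximum likelihood estimator of [George and Agrawal]; it just solves returns[t] = sum over s of ships[s] x f[t - s] for the age-at-failure probabilities f, one age at a time. The function names and data are hypothetical, the numbers are not from Table 1, and with noisy real data this naive recursion can produce negative or erratic values.

```python
def estimate_failure_probs(ships, returns):
    """Solve returns[t] = sum_s ships[s] * f[t - s] for f, one age at a time."""
    if ships[0] <= 0:
        raise ValueError("first-period ships must be positive")
    f = []
    for t in range(len(returns)):
        # Returns in period t already explained by cohorts shipped after period 0.
        explained = sum(ships[s] * f[t - s] for s in range(1, t + 1))
        f.append((returns[t] - explained) / ships[0])
    return f


def reliability_from_probs(f):
    """Reliability at each age = 1 - cumulative probability of failure."""
    reliability, failed = [], 0.0
    for p in f:
        failed += p
        reliability.append(1.0 - failed)
    return reliability


# Hypothetical monthly ships and returns (not the Table 1 data).
ships = [100, 200, 300]
returns = [10, 25, 42]
f_hat = estimate_failure_probs(ships, returns)
R_hat = reliability_from_probs(f_hat)
```

No unit was tracked by serial number to get these estimates; only period totals were used.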
Output nonparametric, discrete, age-specific reliability and failure rate functions for all previous data, age intervals, and subsets. Figure 4 shows the failure rate function estimated for age at first failure. Because the returns did not specify whether returns were first, second, and so on, the failure rate function for age between subsequent returns was also estimated.
Figure 4 shows infant mortality, probably dead-on-arrival and returns shortly after sale, in months three and six.

Figure 4. Monthly failure rate for first failure estimated from typical ships and returns
CONCLUSIONS
Use the new benchmark to:
· Get rid of sample uncertainty if you are tracking only a sample
· Eliminate a mainframe computer and giant database software if you are tracking the population
· Eliminate millions of errors in tracking population returns
If you don't know your products' and parts' field reliability, start gathering your ships and returns data. Send ships and returns data to Larry, and he'll send back reliability and failure rate estimates, free.
In an article proposing a National Field Reliability Applications Award [George], Larry lists 10 cost-effective ways to improve field reliability. Larry asserts that service operations can double their profits (cut costs in half if service is not a profit center), and nobody disputes that assertion. Do the best field reliability applications you can with available information.
REFERENCES
George, L. L. "A National Field Reliability Award?" ASQ Reliability Review, Vol. 16, No. 3, 1996.
George, L. L. and A. Agrawal, "Estimation of a Hidden Service Distribution of an M/G/∞ Service System," Naval Research Logistics Quarterly, pp. 549-555, Vol. 20, No. 3, 1973.
Kaplan, E. L. and P. Meier, "Nonparametric Estimation from Incomplete Observations," J. Am. Statist. Assn., pp. 457-481, Vol. 53, 1958.
Krupp, James A. G., "Forecasting for the Automotive Aftermarket," The J. of Bus. Forecasting, pp. 8-12, winter 1993-94.
O'Connor, Patrick D. T., David Newton, and Richard Bromley, Practical Reliability Engineering, third edition, New York: John Wiley & Sons, 1996.
Plank, Tom, Accounting Deskbook, 10th Edition, New York: Prentice-Hall, 1995.
Tversky, Amos and D. Kahneman, "Judgment under Uncertainty: Heuristics and Biases," Science, pp. 1124-1131, Vol. 185, 1974.
About the Authors
Larry taught for 11 years, worked for Lawrence Livermore National Laboratory for 11 years, and has worked in the real world for more than 11 years. His goal is spreading knowledge and use of age-specific field reliability. He is sole proprietor of Problem Solving Tools.
Contact him at 1-925-447-4969 or pstlarry@home.com.
Eva has her own editorial services business, Text Support, with clients in high-tech, publishing, and marketing.
Contact her at 1-925-314-9588 or eva@orconet.com.