HALT is a limit discovery tool, not a pass-fail test. Some of the most useful data lies near and at the empirical operational limits, not the theoretical or specified ones. Testing to those limits is the fastest way to find weaknesses and to compare electronics system designs. Wide differences in operational limits between samples of the same product are evidence that inconsistent manufacturing processes for one or more components are affecting the system. If the deviation is large enough, the variation will probably affect the operation of a small percentage of units at field use conditions. Variable empirical limits across multiple samples can therefore serve as a discriminator for the consistency of component quality and assembly processes. Wide deviation of operational limits between identical system samples is a good indicator of uncontrolled, possibly unknown, process variation that, if wide enough, will lead to failures in the intended use environment. Even if the number of units compared is not statistically significant, wide differences in limits are good qualitative indicators of reliability risk.
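As a rough illustration of comparing limits between samples, the sketch below summarizes the upper thermal operational limits found for a handful of units. The serial numbers, limit values, and the 70 °C specification are hypothetical, chosen only to show what a wide spread looks like. With so few samples the output is a qualitative flag, not a statistical result.

```python
from statistics import mean, stdev

# Hypothetical upper thermal operational limits (deg C) found for five
# samples of the same product. All values are illustrative assumptions.
upper_op_limit_c = {
    "SN001": 112.0,
    "SN002": 109.0,
    "SN003": 114.0,
    "SN004": 88.0,   # noticeably weaker than its siblings
    "SN005": 111.0,
}
SPEC_MAX_C = 70.0  # assumed end-use specification, not from any datasheet


def summarize_limits(limits, spec_max):
    """Summarize the sample-to-sample spread of empirical limits."""
    values = list(limits.values())
    weakest = min(limits, key=limits.get)
    return {
        "mean": mean(values),
        "stdev": stdev(values),
        "spread": max(values) - min(values),
        "weakest_unit": weakest,
        "min_margin": limits[weakest] - spec_max,
    }


summary = summarize_limits(upper_op_limit_c, SPEC_MAX_C)
for name, value in summary.items():
    print(name, value)
```

In this made-up data every unit still passes the specification, yet one unit gives up far earlier than the rest. That outlier is the kind of difference worth investigating for process variation.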
Stress testing well below the operational limits, even when it is well beyond the end-use specifications, provides only very limited data on the product’s strength capability. Testing only to “margins above spec” that are not close to the empirical stress limit is like watching a car race with a 120 mph speed limit. Some probability exists that a race car in this speed-limited race would have a failure, and some cars would fail and “lose” the race. Still, failures would likely be rare, most of the vehicles would tie for the win, and little differentiating information would be available for improving handling, durability, or reliability over the competing cars. The analogy holds for typical reliability testing: the race cars are driven much faster (at higher stress) than most cars are ever driven, just as most accelerated reliability testing of electronics is performed at higher stresses than most systems will see in their useful life, and some percentage do fail in these milder but above-spec stress conditions.
So why not test to the empirical operational, and sometimes destruct, limits (i.e., HALT)? It is the quickest way to get useful data on product weaknesses. Why do so many resist testing electronics systems to the empirical limits of voltage, temperature, vibration, shock, and other stresses, when those limits show what the product’s ultimate stress capability is? Here are just some of the reasons given over the last couple of decades:
1. Product failures above specified component stress specifications are “foolish failures”
2. Products in the field will never be subjected to those stress levels
3. The product is too expensive to risk destroying samples
To briefly answer those reasons:
1. All components have margins above specification, and functional margins depend heavily on how each component is applied in the design, not on individual component specifications. Why assume any failure is foolish before finding it? Not testing to the operational strength of the actual product leaves what could be valuable data (and ultimately money) on the table.
2. The product may not see the instantaneous stress levels used in the tests, but the cumulative fatigue damage from lower field stresses has a high probability of causing failure at the same design weakness that is found at the destruct limits.
3. How expensive is a product failure to the company and its customers? Finding weaknesses in a test lab almost always costs less than the lost sales and warranty costs incurred when a latent defect or weakness reaches customers. There is risk in all testing, and to find weaknesses at the limits you risk catastrophic damage. Perhaps the resistance comes from the many who believe that finding empirical limits results in a pile of melted solder, components, and plastics. In digital systems, however, it is very difficult to destroy a unit with thermal stress, because parametric shifts cause signal integrity failures, which set the empirical operating limit, before physical damage occurs. Vibration, on the other hand, will eventually cause a hard failure, so the operational limit is also a destruct limit. In any case, the unit can often be repaired and reused for additional testing.
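The second answer above, on cumulative fatigue damage, can be illustrated with a minimal sketch. It assumes two common simplifying models that the text does not name: a power-law (Basquin-type) S-N relation and Miner's linear damage rule. The constants, stress levels, and cycle counts are hypothetical and in arbitrary units; real values depend on the material and the specific weakness.

```python
def cycles_to_failure(stress, c, b):
    """Assumed power-law S-N relation: N = C / S**b."""
    return c / stress ** b


def miner_damage(exposure, c, b):
    """Miner's rule: sum of (cycles applied / cycles to failure).

    exposure is a list of (stress, cycles_applied) pairs.
    A sum of 1.0 or more predicts failure under this model.
    """
    return sum(n / cycles_to_failure(s, c, b) for s, n in exposure)


# Hypothetical constants for one weak feature (illustrative only).
C, B = 1.0e12, 4.0

halt_step = [(20.0, 3.0e6)]   # high stress, relatively few cycles
field_use = [(5.0, 8.0e8)]    # quarter of the stress, far more cycles

halt_damage = miner_damage(halt_step, C, B)
field_damage = miner_damage(field_use, C, B)
print(halt_damage, field_damage)
```

Under these assumptions, a short exposure at high stress and a long exposure at one quarter of that stress each consume roughly half of the fatigue life of the same feature. This is the sense in which a weakness exposed quickly at the limits can also be the one that fails slowly in the field.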
In the reliability development of a new product, we are somewhat like a person in an unfamiliar dark room. We really don’t know how big the room is until we bump into a wall, and actually several walls, to define the available space. In electronics testing, until we find the actual empirical stress limits, we do not know the “stress” space available for finding marginal functional or material issues. The larger the stress space, the faster we can find the strength “entitlement” and use that strength to find the one or two weaknesses in an electronics product that put overall reliability at risk.
Just like the title of the song by the rock group the Eagles, in testing we should “Take It to the Limit” to fully benefit from each sample of an electronics system we test. You will find it takes fewer units and less time and money to find the few elements in a design that really could impact field reliability.