Using the Weibull Distribution to Model Reliability Data

By Pat Valentine, PhD
Uyemura International Corporation
Southington CT

Abstract

The Weibull distribution is versatile and fairly easy to interpret. Understanding how to interpret the Weibull parameters is critical for making decisions on reliability data. The Weibull distribution can be used to model various applications in engineering, medical research, quality control, and finance. This paper reviews the Weibull distribution and provides a worked example of gold wire bonding on electroless nickel electroless palladium immersion gold (ENEPIG).

Keywords: Weibull, reliability, wire bonding, ENEPIG

Introduction

The Weibull distribution may be the most commonly used distribution in reliability analysis. Its genesis can be traced back to 1937, when Waloddi Weibull invented it. In 1951, Weibull presented his hallmark paper on the subject to the American Society of Mechanical Engineers (ASME), claiming that his distribution applied to a wide range of problems. He showed several examples, ranging from the fiber strength of cotton to the fatigue life of steel [1].

The Weibull distribution is versatile and fairly easy to interpret. It can be used to model a wide range of applications in engineering, medical research, quality control, and finance. For example, the distribution is frequently used with reliability analyses to model time-to-failure data. It is also used to model skewed process data in capability analysis.

The Weibull distribution may not always provide the best fit for a data set. It may not work as effectively for product failures caused by chemical reactions or degradation, e.g., corrosion. These situations can occur with semiconductor failures. Usually, in these types of situations, the lognormal distribution is used.

Weibull Distribution

The Weibull distribution is described by its Shape, Scale, and Threshold parameters. The simplest Weibull distribution has two parameters: the Shape or Beta (β) parameter and the Scale or Eta (η) parameter. This simpler Weibull distribution is known as the 2-parameter Weibull distribution. The more complex Weibull distribution has three parameters: the Shape or Beta (β) parameter, the Scale or Eta (η) parameter, and the Threshold parameter. Adding the Threshold parameter changes the nomenclature to the 3-parameter Weibull distribution.
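For reference, the standard textbook form of the 3-parameter Weibull cumulative distribution function and probability density function can be written as shown here. The symbol γ for the Threshold is a notational assumption (the text names this parameter but does not assign it a symbol); setting γ = 0 gives the 2-parameter form.

```latex
\begin{align}
F(x) &= 1 - \exp\!\left[-\left(\frac{x-\gamma}{\eta}\right)^{\beta}\right], & x &\ge \gamma \\
f(x) &= \frac{\beta}{\eta}\left(\frac{x-\gamma}{\eta}\right)^{\beta-1}
        \exp\!\left[-\left(\frac{x-\gamma}{\eta}\right)^{\beta}\right], & x &\ge \gamma
\end{align}
```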

The optional third parameter, the Threshold parameter, can be set to any value to move the lower bound. The Threshold parameter default set point is zero (0) because we can't have failures without first applying stress to the test items. Therefore, the Weibull distribution is bounded on the left tail, not taking on a value below zero. The Weibull distribution can take various forms depending on the parameters' values. A look at the three parameters:

Shape or Beta (β): This describes how the data are distributed and determines the Shape of the Weibull distribution. A Shape of three (3) approximates a normal distribution curve. A low value, say one (1), gives a right-skewed curve. A high value, say 10, gives a left-skewed curve; see Figure 1.

Figure 1. Weibull Shape (Beta) parameter.

Scale or Eta (η): Determines the spread of the Weibull distribution by defining the position of the Weibull curve relative to the Threshold. A larger Scale value stretches the distribution, while a smaller Scale value squeezes it. For the Weibull distribution, the Scale is always the 63.21st percentile of the distribution. For example, a Scale of 15 indicates that 63.21% of the parts will fail within the first 15 units of stress (e.g., hours, cycles, or grams) after the Threshold; see Figure 2.

Figure 2. Weibull Scale (eta) parameter.
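The 63.21% figure follows directly from the Weibull cumulative distribution function: at a stress equal to the Scale, the exponent is -1 regardless of the Shape, so the fraction failed is 1 - 1/e, or about 0.6321. A minimal Python sketch, assuming the standard Weibull CDF (the function name `weibull_cdf` is illustrative and not from any particular package):

```python
import math

def weibull_cdf(x, shape, scale, threshold=0.0):
    """Fraction failed by stress x for a Weibull(shape, scale, threshold)."""
    if x <= threshold:
        return 0.0
    return 1.0 - math.exp(-(((x - threshold) / scale) ** shape))

# Scale of 15, as in the example above; the Shape values are arbitrary.
shapes = (0.5, 1.0, 3.0, 10.0)
fractions = [weibull_cdf(15.0, shape, 15.0) for shape in shapes]
for shape, frac in zip(shapes, fractions):
    print(f"shape={shape:>4}: F(scale) = {frac:.4f}")
```

Every Shape value gives the same fraction at the Scale, which is why the Scale can be read straight off a probability plot at the 63.21% line.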

Threshold: A fixed characteristic of the population distribution that defines the earliest possible failure time. The Threshold parameter describes the shift of the distribution away from zero. All data must be greater than the Threshold parameter. The 2-parameter Weibull distribution is the same as the 3-parameter Weibull with a Threshold of zero. For example, the 3-parameter Weibull (3,75,40) has the same Shape and spread as the 2-parameter Weibull (3,75) but is shifted 40 units to the right; see Figure 3.

Figure 3. Weibull Threshold parameter.
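The shift can be checked numerically. The sketch below assumes the standard Weibull CDF and uses an illustrative function name; it compares the 3-parameter Weibull (3,75,40) with the 2-parameter Weibull (3,75) evaluated 40 units earlier.

```python
import math

def weibull_cdf(x, shape, scale, threshold=0.0):
    """Standard Weibull CDF; returns 0 at or below the threshold."""
    if x <= threshold:
        return 0.0
    return 1.0 - math.exp(-(((x - threshold) / scale) ** shape))

shift = 40.0
for x in (50.0, 100.0, 115.0, 150.0):
    three_param = weibull_cdf(x, 3.0, 75.0, threshold=shift)
    two_param = weibull_cdf(x - shift, 3.0, 75.0)
    print(f"x={x:6.1f}  3-param={three_param:.4f}  2-param at x-40={two_param:.4f}")
```

The two columns agree at every point, and the 63.21% point moves from 75 to 115 (Scale plus Threshold).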

Because of its versatility, the Weibull distribution can model all three stages of a product's life cycle: Infant Mortality, Design Life, and Wear-out. The Weibull ‘bathtub curve’ is a conceptual model for describing reliability-related phenomena at the component level over a product's life cycle; see Figure 4. Each life cycle stage is defined below.

Figure 4. Bathtub curve with its three life cycle stages.

Infant Mortality (β < 1): Failures that occur early during reliability testing are caused by manufacturing defects such as poor quality, out-of-spec processing, substandard manufacturing and assembly practices, etc.

Design Life (β = 1): A constant failure rate (a constant percentage fails among those units still ‘alive’) with random failures (independent of time) due to excessively high loads, environmentally induced stresses, etc.

Wear-out, Early (1 < β < 4): Early wear-out begins with failures of weaker items and progresses to wear-out of stronger parts as β moves from one to four. Failures occur due to low cycle fatigue, corrosion, aging, friction, etc.

Wear-out, Rapid (β > 4): These are highly reliable products, but failures occur due to fatigue, corrosion, aging, friction, etc.
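The link between β and the life cycle stages comes from the Weibull hazard (instantaneous failure rate) function, h(t) = (β/η)(t/η)^(β-1), which falls with time when β < 1, is constant when β = 1, and rises when β > 1. A short sketch, assuming a Threshold of zero and an arbitrary Scale of 100 chosen only for illustration:

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate of a 2-parameter Weibull at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def trend(shape, scale=100.0, times=(10.0, 50.0, 100.0, 200.0)):
    """Classify the hazard as decreasing, constant, or increasing over `times`."""
    rates = [weibull_hazard(t, shape, scale) for t in times]
    if all(abs(r - rates[0]) < 1e-12 for r in rates):
        return "constant"
    return "increasing" if rates[-1] > rates[0] else "decreasing"

for shape in (0.5, 1.0, 2.5, 6.0):
    print(f"beta={shape}: hazard is {trend(shape)}")
```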

It is easy to see how these three parameters can be manipulated to fit various problems, just as Waloddi Weibull stated in 1951 [1]. The primary advantage of Weibull analysis is the ability to provide reasonably precise failure analysis and time-to-fail forecasts with extremely small sample sizes [2].

Interpretation of the Shape (β) and Scale (η) Parameters

The interpretation of the Shape (β) and Scale (η) parameters must be taken in reference to the desired reliability of the parts and their design life. Large Shapes (β) within the design life are a source of concern as there is a risk of the entire population failing quickly as the parts age into the wear-out part of the life cycle. On the other hand, if the Weibull Scale value (η) is well beyond the minimal desired design life, there is a negligible probability of failure before part retirement. Large Shapes (β) are a source of happiness in this case. Most large Shapes (β) have a safe period before the onset of failures, where the probability of failure is negligible. The larger the Shape (β) for a given Scale (η) value, the smaller the variation in times to failure and the more predictable the results of the product (tighter manufacturing controls). A vertical Shape (β) of infinity implies perfect design, manufacturing, and quality control [2].
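The statement that a larger Shape means smaller variation for a given Scale can be illustrated with the coefficient of variation (standard deviation divided by mean). For a 2-parameter Weibull, the mean is ηΓ(1 + 1/β) and the variance is η²[Γ(1 + 2/β) - Γ(1 + 1/β)²], so the ratio depends on β alone. The sketch below uses these standard formulas; the β values are arbitrary.

```python
import math

def weibull_cv(shape):
    """Coefficient of variation of a 2-parameter Weibull (independent of scale)."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return math.sqrt(g2 - g1 * g1) / g1

for shape in (1.0, 3.0, 10.0, 30.0):
    print(f"beta={shape:>5}: CV = {weibull_cv(shape):.3f}")
```

The coefficient of variation shrinks steadily as β grows, which is the numerical counterpart of the steeper line on the probability plot.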

Assessing the Weibull Plot Fit

There are three typical goodness-of-fit measures used to assess the fit:

Anderson-Darling (AD) Adjusted Statistic: A relative goodness-of-fit measure that can be used to compare how well the selected distribution fits the data against the fit provided by alternate distributions. One can compare the AD statistic for several distributions with the same number of parameters; smaller AD values indicate a better fit. However, to conclude that one distribution is the best, its AD statistic must be substantially lower than the others. When the AD statistics are close together, you should use the probability plots to choose between them [3, 4, 5].

Correlation Coefficient (Pearson’s): Measures the strength of the linear relationship between the data set and the chosen distribution. If the distribution fits the data well, the points on a probability plot will fall almost on a straight line (superimposed on, or very close to, the diagonal line), and the correlation coefficient will approach one (1), which is this statistic’s maximum value. The idea of a probability plot correlation coefficient came from James Filliben, who also discussed extensions of the test for non-normal distributional hypotheses [6]. Vogel completed additional probability plot correlation coefficient work on several distributions [7].

Visual Examination of the Probability Plot (Fit and Competing Failure Modes): The data points (failure times) should line up close to a diagonal line, indicating a good fit. The blue diagonal line is drawn from the Shape (β) and Scale (η) parameters, which are iteratively estimated from the data set. The slope of the line follows the Shape (β) parameter, and the Scale (η) sits at the 63.21st percentile on the percent (y) axis; see Figure 5.

Figure 5. Weibull probability plot fit.
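To make the probability plot concrete: on Weibull axes the CDF is linearized as ln(-ln(1 - F)) = β·ln(x) - β·ln(η), so a good fit appears as a straight line with slope β. The sketch below is one simple way to reproduce the idea and is not necessarily what a given statistics package does. It assumes complete (uncensored) data, Benard's median-rank approximation (i - 0.3)/(n + 0.4) for the plotting positions, and an ordinary least-squares line in place of iterative maximum-likelihood estimation; the data values are hypothetical.

```python
import math

def weibull_plot_fit(values):
    """Least-squares line on a Weibull probability plot for complete data.

    Returns (shape, scale, r), where r is the Pearson correlation of the
    plotted points.
    """
    xs = sorted(values)
    n = len(xs)
    # Benard's approximation to the median rank of the i-th ordered value.
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    u = [math.log(x) for x in xs]                      # horizontal axis
    v = [math.log(-math.log(1.0 - f)) for f in ranks]  # vertical axis
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    shape = suv / suu                    # slope of the fitted line
    scale = math.exp(mu - mv / shape)    # x where the line crosses 63.21%
    r = suv / math.sqrt(suu * svv)
    return shape, scale, r

# Hypothetical failure times, for illustration only.
times = [42.0, 55.0, 61.0, 68.0, 74.0, 80.0, 87.0, 95.0, 104.0, 118.0]
shape, scale, r = weibull_plot_fit(times)
print(f"shape={shape:.2f}  scale={scale:.1f}  correlation={r:.3f}")
```

Parameter estimates from a regression on the plot will generally differ somewhat from maximum-likelihood estimates, so the numbers should not be expected to match a package's output exactly.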

The plotted data in Figure 5 have an excellent fit, with a correlation coefficient of 0.980. The Anderson-Darling (AD) statistic is relative, so it is most useful when the Weibull distribution is compared with an alternative such as the lognormal distribution; whichever distribution has the substantially smaller AD statistic provides the better fit. During visual analysis, one needs to look for dog legs, cusps, and corners, as these indicate that two or more competing failure modes are present; see Figure 6. It is generally best practice to analyze multiple failure modes separately when they behave differently.

Figure 6. Competing failure modes.
Note: for illustrative purposes only; some patterns may not be mathematically possible.

Reliability Testing

Probability is a branch of mathematics that deals with the occurrence of random events. Probabilities are expressed from zero (0) to one (1) and describe the likelihood of an event having occurred or occurring in the future. Reliability is the probability (0 to 1) of a product performing its intended function over its specified usage period and under specified operating conditions in a manner that meets or exceeds customer expectations. Reliability is simply quality over time.

Reliability theory developed apart from the mainstream of probability and statistics. It was used primarily as a tool to help nineteenth-century maritime and life insurance companies compute profitable rates to charge their customers [3]. In today’s technological world, nearly everyone depends upon the continued functioning of complex machinery and equipment [3].

In reliability testing and analysis, any time there are actual failures, either with or without censored data, we will use an empirical distribution for analysis and inference about the population. An empirical distribution is the distribution function associated with the actual measure of a sample.

Censored data means we do not know the exact failure times (e.g., the reliability testing was stopped before a failure occurred). We call this ‘right’ censored data. Each data value is coded with an indicator: ‘0’ is usually used for censored values, and ‘1’ is used for actual failure times.
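One way to see why the censoring indicator matters: in a likelihood-based fit, an actual failure at time t contributes the density f(t), while a right-censored unit contributes only the survival probability S(t) = exp(-(t/η)^β), because all that is known is that it survived past t. The following sketch assumes a 2-parameter Weibull and the 1 = failure, 0 = censored coding described above; the data and the function name are hypothetical.

```python
import math

def weibull_log_likelihood(data, shape, scale):
    """Log-likelihood of right-censored data under a 2-parameter Weibull.

    `data` is a list of (time, indicator) pairs: indicator 1 marks an
    actual failure, 0 marks a unit still working when testing stopped.
    """
    total = 0.0
    for t, failed in data:
        z = (t / scale) ** shape
        if failed:
            # log f(t) = log(shape/scale) + (shape - 1) * log(t/scale) - z
            total += math.log(shape / scale) + (shape - 1.0) * math.log(t / scale) - z
        else:
            # log S(t) = -z
            total -= z
    return total

# Hypothetical test: four failures, two units removed unfailed at 500 hours.
observations = [(212.0, 1), (305.0, 1), (388.0, 1), (451.0, 1), (500.0, 0), (500.0, 0)]
print(weibull_log_likelihood(observations, shape=2.0, scale=450.0))
```

An estimation routine would search for the Shape and Scale that maximize this quantity; treating the censored units as failures, or dropping them, would bias the result toward shorter lives.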

A Worked Example

Electroless nickel electroless palladium immersion gold (ENEPIG) is a standard printed circuit board final finish. The electroless palladium layer provides a base for gold wire bonding. The ENEPIG process engineer is qualifying a new final finish line and, as part of the qualification, will evaluate 1-mil gold wire bonding (ball and stitch). The process engineer plates samples under controlled conditions and collects wire bond pull data; see Table 1. The process engineer then creates a Weibull probability plot and assesses the fit; see Figure 7.

Table 1. 1-mil gold wire bond pull data.

Figure 7. Weibull probability plot.

The plotted data in Figure 7 have an excellent fit, with a correlation coefficient of 0.991. The Anderson-Darling (AD) statistic is relative and would be most useful if the Weibull distribution were compared with an alternative such as the lognormal distribution. During visual analysis, the process engineer looks for signs of competing failure modes (dog legs, cusps, and corners), but none exist. The data points (pull forces) line up close to the diagonal line, indicating a good fit. The process engineer is satisfied with the fit and next interprets the Shape (β) and Scale (η) parameters in reference to the desired reliability of the parts and their design life.

The Shape parameter is 31.5 (rapid wear-out), the Scale parameter is 9.86 (grams), and the Threshold parameter is zero (a 2-parameter Weibull distribution). The process engineer references MIL-STD-883K Condition D with 1-mil gold wire, pre-seal, which requires a minimum bond force of 3 grams [8]. The Weibull Scale value of 9.86 grams is well beyond the minimum requirement of three grams, so there is a negligible probability of failure before part retirement. In this case, the enormous Shape value of 31.5 is a source of happiness because it reflects small variation in gram pull forces; in other words, the results of the ENEPIG and wire bonding processes are more predictable (tighter manufacturing controls). Lastly, for ease of interpretation, the process engineer creates a Weibull probability distribution plot using the Shape parameter of 31.5 and the Scale parameter of 9.86; see Figure 8.

Figure 8. Weibull probability distribution plot.
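As a rough numerical check on the worked example, the fitted parameters can be plugged into the 2-parameter Weibull CDF to estimate the fraction of bonds expected to pull below the 3-gram minimum, and the CDF can be inverted to get low percentiles of pull force. This is only a sketch: it takes the reported Shape (31.5) and Scale (9.86 grams) as given, ignores their estimation uncertainty, and extrapolates far below the observed data. The function names are illustrative.

```python
import math

SHAPE = 31.5   # reported Shape (beta)
SCALE = 9.86   # reported Scale (eta), grams

def fraction_below(x, shape=SHAPE, scale=SCALE):
    """Weibull CDF; expm1 keeps precision when the result is tiny."""
    return -math.expm1(-((x / scale) ** shape))

def percentile(p, shape=SHAPE, scale=SCALE):
    """Pull force below which a fraction p of bonds is expected to fall."""
    return scale * (-math.log1p(-p)) ** (1.0 / shape)

print(f"P(pull < 3 g) = {fraction_below(3.0):.1e}")
print(f"1st percentile = {percentile(0.01):.2f} g")
print(f"63.21st percentile = {percentile(1.0 - math.exp(-1.0)):.2f} g")
```

Under these assumptions, the estimated fraction below 3 grams is vanishingly small and even the 1st percentile sits well above the requirement, which is consistent with the interpretation given in the text.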

Conclusions

The genesis of the Weibull distribution can be traced back to 1937. It is a versatile distribution that can be used to model various applications in engineering, medical research, quality control, and finance. The Weibull distribution is described by its Shape, Scale, and Threshold parameters. Because of its versatility, the Weibull distribution can model all three stages of a product's life cycle: infant mortality, design life, and wear-out. Three typical goodness-of-fit measures are used to assess the Weibull fit: the Anderson-Darling statistic, the correlation coefficient (Pearson’s), and a visual examination of the probability plot. In today's technological world, nearly everyone depends upon the continued functioning of complex machinery and equipment. Reliability is simply quality over time.

References

[1] W. Weibull, “A Statistical Distribution Function of Wide Applicability,” ASME Journal of Applied Mechanics, pp. 293-297, 1951.

[2] Robert B. Abernethy, The New Weibull Handbook, 5th Ed.

[3] NIST Engineering Statistics Handbook, 2012. https://www.itl.nist.gov/div898/handbook/

[4] Ryan, Modern Engineering Statistics, 2007.

[5] Gary S. Wasserman, Reliability Verification, Testing, and Analysis in Engineering Design, 2003.

[6] James J. Filliben, “The Probability Plot Correlation Coefficient Test for Normality,” Technometrics, vol. 17, no. 1, pp. 111-117, 1975.

[7] Richard M. Vogel, “The Probability Plot Correlation Coefficient Test for the Normal, Lognormal, and Gumbel Distributional Hypotheses,” Water Resources Research, vol. 22, no. 4, pp. 587-590, 1986.

[8] MIL-STD-883K, 2017.

Biography

Patrick Valentine is the Technical and Lean Six Sigma Manager for Uyemura USA. He teaches Six Sigma Green Belt and Black Belt courses as part of his responsibilities. He holds a Doctorate in Quality Systems Management from Cambridge College and ASQ certifications as a Six Sigma Black Belt and Reliability Engineer. Patrick can be contacted at pvalentine@uyemura.com.

Using the Weibull Distribution
to Model Reliability Data
