Notes on Statistical Methods for Reliability Data

Qianqian Shan
Contents

1 Reliability Concepts and Reliability Data
1.1 Introduction
1.2 General Models for Reliability Data
1.2.1 Target Population or Process
1.2.2 Causes of Failure and Degradation Leading to Failure
1.2.3 Environment Effects on Reliability
1.2.4 Time Scale
1.3 Repairable Systems and Nonrepairable Units
1.4 Strategy for Data Collection, Modeling and Analysis
1.4.1 Planning a Reliability Study
1.4.2 Strategy for Data Analysis and Modeling
2 Models, Censoring, and Likelihood for Failure-Time Data
2.1 Models for Continuous Failure-Time Process
2.2 Models for Discrete Data From a Continuous Process
2.3 Censoring
2.3.1 Censoring Mechanism
2.3.2 Assumptions on Censoring Mechanism
2.4 Likelihood
3 Non-parametric Estimation
3.1 Estimation from Singly Censored Interval Data
3.2 Basic Ideas of Statistical Inference
3.3 Confidence Intervals from Complete or Singly Censored Data
3.3.1 Pointwise Binomial-Based Confidence Interval for F(t_i)
3.3.2 Pointwise Normal-Approximation Confidence Interval for F(t_i)
3.4 Estimation from Multiply Censored Data
3.5 Pointwise Confidence Intervals from Multiply Censored Data
3.5.1 Approximate Variance of F̂(t_i)
3.5.2 Greenwood's Formula
3.5.3 Pointwise Normal-Approximation Confidence Interval for F̂(t_i)
3.6 Estimation from Multiply Censored Data with Exact Failures
3.7 Simultaneous Confidence Bands
3.8 Uncertain Censoring Times
4 Location-Scale-Based Parametric Distributions
4.1 Quantities of Interest in Reliability Applications
4.2 Location-Scale and Log-Location-Scale Distributions
4.3 Exponential Distribution
4.4 Normal Distribution
4.5 Log-normal Distribution
4.6 Smallest Extreme Value Distribution
4.7 Weibull Distribution
4.8 Largest Extreme Value Distribution
4.9 Logistic Distribution
4.10 Log-logistic Distribution
4.11 Generating Pseudo-random Observations from a Specified Distribution
4.11.1 Pseudo-random Observations from Continuous Distributions
4.11.2 Pseudo-random Observations from Discrete Distributions
4.11.3 Efficient Generation of Censored Pseudorandom Samples
5 Other Parametric Distributions
6 Probability Plotting
6.1 Purpose
6.2 Linearizing a CDF
6.3 Probability Plotting Positions
6.3.1 Criteria for Choosing Plotting Positions
6.3.2 Choice of Plotting Positions
6.4 Probability Plots with Specified Shape Parameter
7 Parametric Likelihood Fitting Concepts: Exponential Distribution
7.1 Introduction
7.2 Parametric Likelihood
7.3 Confidence Intervals for θ
7.3.1 Likelihood Confidence Intervals for θ
7.3.2 Normal-Approximation Confidence Intervals for θ
7.4 Confidence Intervals for Functions of θ
7.5 Likelihood for Exact Failure Times
7.5.1 Correct Likelihood for Observations Reported as Exact Failures
7.5.2 Using Density Approximation for Observations Reported as Exact Values
7.6 Data Analysis with No Failures
8 Maximum Likelihood for Log-Location-Scale Distributions
8.1 Likelihood
8.2 Likelihood Confidence Regions and Intervals
8.2.1 Joint Confidence Regions for µ and σ
8.2.2 Individual Confidence Intervals for µ and σ
8.2.3 Likelihood Confidence Intervals for Functions of µ and σ
8.3 Normal-Approximation Confidence Intervals
8.4 Estimation with Given σ
9 Bootstrap Confidence Intervals
9.1 Bootstrap Sampling
9.2 Confidence Intervals
9.3 Percentile Bootstrap Method
10 Planning Life Tests
10.1 Introduction
10.2 Approximate Variance of ML Estimators
10.2.1 Basic Large-Sample Approximation
10.3 Sample Size for Unrestricted Functions
11 Parametric Maximum Likelihood: Other Models
11.1 Truncated Data and Truncated Distributions
11.2 Summary
12 Prediction of Future Random Quantities
12.1 Probability Prediction Intervals (θ Given)
12.2 Statistical Prediction Intervals (θ Estimated)
12.2.1 Coverage Probability Concepts
12.2.2 Naive Method for Computing a Statistical Prediction Interval
12.3 The (Approximate) Pivotal Method for Prediction Intervals
12.3.1 Type II (Failure) Censoring
12.3.2 Type I Censoring
13 Degradation Data, Models, and Data Analysis
13.1 Introduction
13.2 Models for Degradation Data
13.2.1 Degradation Data
13.2.2 Degradation Leading to Failure
13.2.3 Models for Variation in Degradation and Failure Times
13.2.4 Limitations of Degradation Data
13.2.5 General Degradation Path Model
13.2.6 Degradation Model Parameters
13.3 Estimation of Degradation Model Parameters
13.4 Models Relating Degradation and Failure
13.4.1 Soft Failures: Specified Degradation Level
13.4.2 Hard Failures: Joint Distribution of Degradation and Failure Level
13.5 Evaluation of F(t)
13.5.1 Analytic Solution for F(t)
13.5.2 Numerical Evaluation of F(t)
13.5.3 Monte Carlo Evaluation of F(t)
13.6 Estimation of F(t)
13.7 Bootstrap Confidence Intervals
13.8 Comparison with Traditional Failure-Time Analysis
13.9 Approximate Degradation Analysis
14 Introduction to the Use of Bayesian Methods for Reliability Data
14.1 Introduction
14.2 Using Bayes Rule to Update Prior Information
14.3 Prior Information and Distributions
14.3.1 Noninformative (Diffuse) Prior Distributions
14.3.2 Using Past Data to Specify a Prior Distribution
14.3.3 Expert Opinion and Eliciting Prior Information
14.4 Numerical Methods for Combining Prior Information with a Likelihood
14.4.1 Simulation-Based Methods for Computing the Posterior Distribution of θ
14.4.2 Marginal Posterior Distributions
14.5 Using the Posterior Distribution for Estimation
14.5.1 Bayesian Point Estimation
14.5.2 Bayesian Interval Estimation
14.6 Bayesian Prediction
14.6.1 Bayesian Posterior Predictive Distribution
14.6.2 Approximating Posterior Predictive Distribution
14.6.3 Posterior Predictive Distribution for the kth Failure from a Future Sample of Size m
14.7 Practical Issues in the Application of Bayesian Methods
14.7.1 Comparison Between Bayesian and Likelihood/Frequentist Statistical Methods
14.7.2 Caution on the Use of Prior Information
15 System Reliability Concepts and Methods
15.1 Introduction
15.2 System Structures and System Failure Probability
15.2.1 Time Dependency of System Reliability
15.2.2 Systems with Components in Series
15.2.3 Systems with Components in Parallel
15.2.4 Systems with Components in Series-Parallel
15.2.5 Bridge System Structure
15.2.6 k-Out-of-s System Structures
15.3 Estimating System Reliability From Component Data
16 Analysis of Repairable System and Other Recurrence Data
16.1 Introduction
16.1.1 Repairable System Reliability Data and Other Recurrence Data
16.1.2 A Nonparametric Model for Recurrence Data
16.2 Non-Parametric Estimation of the MCF
16.2.1 Non-Parametric Model Assumptions
16.2.2 Point Estimate of the MCF
16.2.3 Standard Errors and Non-parametric Confidence Intervals for the MCF
16.2.4 Adequacy of Normal-Approximation Confidence Intervals
16.3 Non-Parametric Comparison of Two Samples of Recurrence Data
16.4 Parametric Models for Recurrence Data
16.4.1 Poisson Process
16.4.2 Homogeneous Poisson Process (HPP)
16.4.3 Non-homogeneous Poisson Process (NHPP)
16.4.4 Renewal Process
16.4.5 Superimposed Renewal Process
16.5 Tools for Checking Point-Process Assumptions
16.5.1 Tests for Recurrence Rate Trend
16.5.2 Test for Independent Interrecurrence Times
16.6 Maximum Likelihood Fitting of Poisson Process
16.6.1 Poisson Process Likelihood
16.6.2 Superimposed Poisson Process Likelihood
16.6.3 ML Estimation for the Power NHPP and Loglinear NHPP
16.6.4 Confidence Intervals for Parameters and Functions of Parameters
16.6.5 Prediction of Future Recurrences with a Poisson Process
16.7 Generating Pseudo-Random Realizations from an NHPP
16.7.1 General Approach
17 Failure-Time Regression Analysis
1 Reliability Concepts and Reliability Data
1.1 Introduction
Definition of Reliability: the probability that a system, vehicle, machine, etc., will perform its intended function, under operating conditions, for a specified period of time.
Reliability is quality over time.
Features of Reliability Data:
1. Data are typically censored, due to the frequent need to analyze life-test data before all units have failed.
2. Most reliability data are modeled using distributions for positive
random variables such as exponential, Weibull . . .
3. Inferences and predictions involving extrapolation are often required.
4. It's often necessary to use past experience or other scientific or engineering judgment to provide information as input to the analysis of data or to a decision-making process.
5. The traditional parameters (such as mean and standard deviation) of
a statistical model are typically not of primary interest. Instead,
specific measures of product reliability or particular characteristics of
a failure-time distribution (e.g., quantiles, rates) are of more interest.
6. Model fitting requires computer implementation of numerical
methods, and often there is no exact theory for statistical inferences,
especially with censored data.
1.2 General Models for Reliability Data
1.2.1 Target Population or Process
Deming (1975) stated that statistical studies can be broadly divided into
two categories:
1. Enumerative studies answer questions about populations
that consist of a finite set of identifiable units. Typically, the
study is conducted by randomly selecting a sample from the
population, carefully evaluating the units in the sample and then
making an inference about the larger population from which the
sample is taken.
Assumption: the sampling frame accurately represents the units in
the population.
2. Analytic studies answer questions about processes that
generate units or other output over time. Interest might center
on the life distribution of electric motors that will be produced in the
future.
Assumption: The process will behave in the future the same as it
has in the past.
1.2.2 Causes of Failure and Degradation Leading to Failure
Many failure modes can be traced to some underlying degradation process.
For example, automotive brake pads wear with use.
Traditionally, most statistical studies of product reliability have been based
on failure-time data, though the actual level of degradation can also be very
helpful, especially when there are few or no failures.
Not all failures can be traced to degradation, though; some failures are
caused by sudden accidents.
Understanding the physical and chemical mechanisms and random risk
factors leading to failure can suggest methods for eliminating failure modes
or reducing the probability of a failure mode, thereby improving reliability.
1.2.3 Environment Effects on Reliability
Environmental factors play an important role in product reliability. A large
proportion of product reliability problems result from unanticipated failure
modes caused by environmental effects that were not part of the initial
reliability evaluation system.
One challenge of product design is to discover and develop economical
means of building in robustness to environmental and other factors that
manufacturers and users are unable to control.
1.2.4 Time Scale
The life of many products can be measured on more than one scale. For
example, life would be measured in number of cycles for factory life tests of
products such as washing machines and toasters, while time-in-service data
are more commonly available.
The choice of a time scale for measuring product life is often suggested by
an underlying process leading to failure, even if the degradation cannot be
observed directly.
There are a number of methods that can be used to handle data with more
than one life measurement. For example, suppose we measure battery age
and number of charge/discharge cycles. We can (1) estimate the effects of
both factors and develop a measure of battery life as a function of both, or
(2) develop a model that uses the number of cycles to help explain the
variability in the time to failure.
1.3 Repairable Systems and Nonrepairable Units
There are two situations regarding failure:
1. The time of failure for nonrepairable units or components, or time to
first failure of a system.
2. A sequence of reported system-failure times for a repairable system.
A system may have both repairable and nonrepairable components.
1.4 Strategy for Data Collection, Modeling and
Analysis
Reliability studies involving laboratory experimentation or field tracking
require careful planning and execution. Mistakes can be costly in terms of
materials and time, and there is also the possibility of drawing erroneous
conclusions.
1.4.1 Planning a Reliability Study
1. Define problem to be solved and questions to be answered.
2. Consider the resources available for the study (time, money, equipment,
etc.).
3. Design the experiment or study with an assessment of the precision of
estimates as a function of the size of the study. As the precision of
estimation generally depends on unknown model parameters, this assessment
will require planning values of unknown population and process
characteristics.
It's often useful to conduct a pilot study to obtain the needed information
when little is known about the target population or process.
4. In new situations, it's useful, before the test, to conduct a trial
analysis of data simulated from a proposed model suggested by
available information or previous experience. See Chapters 10 and 20
and Section 22.5 for more details.
1.4.2 Strategy for Data Analysis and Modeling
General strategy:
1. Begin the analysis by looking at the data without making any
distributional or other strong model assumptions, i.e., look at the
data without any possible distortion.
The primary tool for this step is graphical analysis.
2. If possible, fit one or more parametric models to the data for the
purpose of description, estimation or prediction.
3. Examine appropriate diagnostics and use other tools to assess the
adequacy of the model assumptions before making any estimation or
prediction.
Especially when there is little data, it may be difficult to detect
departures from model assumptions; this just means that we have
no strong evidence against an assumption, NOT that the
assumption can be trusted.
4. Proceed with caution to estimate parameters and make predictions if
step 3 is satisfactory. The estimates and predictions usually come with
statistical intervals to reflect uncertainty and variability.
5. Display results of the analysis graphically, including estimates or
predictions with uncertainty bounds.
6. It's possible to use the results to draw conclusions about reliability,
contingent on particular model assumptions.
If the data provide little information to assess the adequacy of the
assumptions, it's useful to vary the assumptions and assess the impact
such perturbations have on the final answers.
Such sensitivity analyses should be reported along with the conclusions.
2 Models, Censoring, and Likelihood for
Failure-Time Data
2.1 Models for Continuous Failure-Time Process
The failure-time distribution is the most widely used metric for the
reliability of a product, and failure-time data are the most commonly
collected reliability data.
The cumulative distribution function, probability density function, survival
function, and hazard function are all used to characterize the probability
distribution of the failure time T.
1. Cumulative Distribution Function (CDF): F(t) = P(T ≤ t) is the cdf
of T. It gives the proportion of units that will fail by time t.
2. Probability Density Function(PDF): the derivative of cdf. It
represents the relative frequency of failure times as a function of time.
3. Survival Function (SF): also called the reliability function, the
complement of the cdf: S(t) = 1 − F(t). It represents the probability
of surviving beyond time t.
4. Hazard Function (HF), also called the hazard rate, the instantaneous
failure rate function:

    h(t) = lim_{Δt→0} P(t < T ≤ t + Δt | T > t) / Δt          (2.1)
         = lim_{Δt→0} P(t < T ≤ t + Δt) / [Δt · P(T > t)]     (2.2)
         = f(t) / [1 − F(t)].                                  (2.3)

[4.1] Cumulative Hazard Function: H(t) = ∫₀^t h(x) dx, and
F(t) = 1 − exp[−H(t)] = 1 − exp[−∫₀^t h(x) dx].
[4.2] Average Hazard Rate:

    AHR(t₁, t₂) = ∫_{t₁}^{t₂} h(u) du / (t₂ − t₁) = [H(t₂) − H(t₁)] / (t₂ − t₁)

is the average hazard rate over the interval; it is approximately the
fraction failing per unit time over the specified interval.

[4.3] Hazard rate in FITs: A FIT (failure in time) rate is defined as
the hazard function in units of 1/hours, multiplied by 10⁹; it is
commonly used in high-reliability electronics applications.
Figure 1: The Four Functions
5. Quantile Function: the p quantile of F(t) is the smallest t satisfying
P(T ≤ t) = F(t) ≥ p.
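To make the four functions concrete, here is a minimal Python sketch (not from the original notes) that evaluates F, f, S, and h for an assumed Weibull distribution; the parameter values eta and beta are arbitrary illustrations.

    import numpy as np

    # Assumed Weibull parameters, for illustration only.
    eta, beta = 1000.0, 2.0   # scale (hours), shape

    def cdf(t):
        # F(t) = 1 - exp[-(t/eta)^beta]
        return 1.0 - np.exp(-(t / eta) ** beta)

    def pdf(t):
        # f(t) = dF/dt
        return (beta / eta) * (t / eta) ** (beta - 1) * np.exp(-(t / eta) ** beta)

    def sf(t):
        # S(t) = 1 - F(t)
        return 1.0 - cdf(t)

    def hazard(t):
        # h(t) = f(t) / [1 - F(t)], eq. (2.3)
        return pdf(t) / sf(t)

    t = 500.0
    print(cdf(t), pdf(t), sf(t), hazard(t))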
2.2 Models for Discrete Data From a Continuous
Process
1. Partition the time line (0, ∞) into m + 1 observation intervals
(t₀, t₁], (t₁, t₂], ..., (t_m, t_{m+1}], where t₀ = 0 and t_{m+1} = ∞.

2. Define the multinomial probability that a unit will fail in interval i as
π_i = P(t_{i−1} < T ≤ t_i) = F(t_i) − F(t_{i−1}), where π_i ≥ 0 and
Σ_{j=1}^{m+1} π_j = 1.

3. Then the conditional probability that a unit will fail in interval i,
given that the unit was still operating at the beginning of interval i, is
p_i = P(t_{i−1} < T ≤ t_i | T > t_{i−1}) = π_i / S(t_{i−1}).

4. The survival function is then

    S(t_i) = Π_{j=1}^{i} (1 − p_j),  i = 1, ..., m + 1.    (2.4)

5. The cdf at time t_i:

    F(t_i) = 1 − Π_{j=1}^{i} (1 − p_j) = Σ_{j=1}^{i} π_j.    (2.5)
2.3 Censoring
2.3.1 Censoring Mechanism
Type I censoring: removing all unfailed units from the test at a
prespecified time is known as "time censoring".
Type II censoring: terminating a test after a specified number of failures
is known as "failure censoring".
Interval censoring: failures are only observed when inspections are
conducted, so interval-censored observations have an upper and a lower
bound on the failure time; also known as inspection data, grouped data, or
read-out data.
Random right censoring: some products may have more than one cause
of failure, and if the primary interest is focused on one particular cause,
then the failures from other causes can, in some situations, be viewed as
random right-censored data.
Systematic multiple censoring: censoring due to staggered entry of
units. The data may be analyzed before all units have failed, and the
data will usually be multiply right-censored, with some failure times
exceeding some of the running times.
2.3.2 Assumptions on Censoring Mechanism
1. The censoring time of a unit depends ONLY on the history of the
observed failure-time process.
2. Censoring is non-informative, i.e., the censoring times of units provide
no information about the failure-time distribution; this is related to the
first assumption.
2.4 Likelihood
The total likelihood , or joint probability of the data for n independent
observations including left-censored, interval-censored and right-censored
data:
L(p; DAT A) = C ·
m+1
Y
i=1
[F (t
i
)]
`
i
· [F (t
i
) F (t
i1
)
d
i
] · [1 F (t
i
)]
r
i
. (2.6)
where n =
P
m+1
j=1
(d
j
+ `
j
+ r
j
) and C is a constant which is usually taken as
1.
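As an illustration of how (2.6) might be evaluated in practice, the following Python sketch computes the log of (2.6) with C = 1 for an arbitrary parametric cdf; the Weibull cdf and the interval counts used at the end are assumptions made up for the example.

    import numpy as np

    def interval_loglik(F, t, ell, d, r):
        """Log-likelihood of (2.6) with C = 1 for interval endpoints
        t_1 < ... < t_{m+1} (t_0 = 0 implied), left-censored counts ell,
        failure counts d, and right-censored counts r."""
        t, ell, d, r = map(np.asarray, (t, ell, d, r))
        Ft = F(t)
        Fprev = np.concatenate(([0.0], Ft[:-1]))   # F(t_0) = 0

        def term(count, prob):
            # count * log(prob), with the 0 * log(0) convention -> 0
            with np.errstate(divide="ignore", invalid="ignore"):
                return np.where(count > 0, count * np.log(prob), 0.0)

        return float(np.sum(term(ell, Ft) + term(d, Ft - Fprev)
                            + term(r, 1.0 - Ft)))

    # Illustrative use with an assumed Weibull cdf (eta, beta arbitrary):
    weib = lambda t, eta=1000.0, beta=2.0: 1.0 - np.exp(-(t / eta) ** beta)
    print(interval_loglik(weib, t=[200, 400, 600], ell=[0, 0, 0],
                          d=[1, 2, 1], r=[0, 0, 6]))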
Likelihood for Random Censoring in the Intervals:

    L_i(p; data_i) = {P[(T ≤ C) ∩ (t_{i−1} < T ≤ t_i)]}^{d_i} × {P[(C < T) ∩ (t_{i−1} < C ≤ t_i)]}^{r_i}
                   = [∫_{t_{i−1}}^{t_i} f_T(t)[1 − F_C(t)] dt]^{d_i} × [∫_{t_{i−1}}^{t_i} f_C(t)[1 − F_T(t)] dt]^{r_i},

for r_i right-censored observations and d_i failures in (t_{i−1}, t_i].
3 Non-parametric Estimation
3.1 Estimation from Singly Censored Interval Data
The non-parametric estimator F̂(t), based on the binomial distribution, is

    F̂(t_i) = (number of failures up to time t_i) / n = Σ_{j=1}^{i} d_j / n,    (3.1)

where n is the initial number of units and d_j is the number of units that
failed/died in interval (t_{j−1}, t_j].
3.2 Basic Ideas of Statistical Inference
The estimates F̂(t) can be interpreted beyond the particular sample
units, and used to make inferences about the process or the larger existing
population of units from which the sample units are randomly chosen
(inferential statistics).
If the sampling and estimation were repeated many times, the sampling
distribution of F̂(t_i) could be produced, which provides insight into the
possible deviation between F̂(t_i) and F(t_i).
Confidence intervals are one of the most useful ways of quantifying the
uncertainty due to "sampling error" arising from limited sample sizes.
Coverage probability is the probability that a confidence interval
procedure will result in an interval containing the quantity of interest.
3.3 Confidence Intervals from Complete or Singly
Censored Data
3.3.1 Pointwise Binomial-Based Confidence Interval for F(t_i)

A conservative 100(1 − α)% confidence interval [F_, F̃] is:

    F_(t_i) = {1 + [(n − nF̂ + 1)/(nF̂)] · F_{(1−α/2; 2n−2nF̂+2, 2nF̂)}}^{−1},    (3.2)

    F̃(t_i) = {1 + (n − nF̂)/[(nF̂ + 1) · F_{(1−α/2; 2nF̂+2, 2n−2nF̂)}]}^{−1},    (3.3)

where F̂ = F̂(t_i) and F_{(p; ν₁, ν₂)} denotes the p quantile of the F
distribution with (ν₁, ν₂) degrees of freedom.
16
3 NON-PARAMETRIC ESTIMATION
3.3.2 Pointwise Normal-Approximation Confidence Interval for F(t_i)

An approximate 100(1 − α)% confidence interval for F(t_i) is

    [F_(t_i), F̃(t_i)] = F̂(t_i) ± z_{1−α/2} ŝe_{F̂},    (3.4)

where ŝe_{F̂} = √{F̂(t_i)[1 − F̂(t_i)]/n}.
3.4 Estimation from Multiply Censored Data
The size of the risk set at the beginning of interval i is

    n_i = n − Σ_{j=0}^{i−1} d_j − Σ_{j=0}^{i−1} r_j,    (3.5)

where d_j is the number of units that died/failed in the j-th interval
(t_{j−1}, t_j] and r_j is the number of units censored at t_j.
Then the estimator of the conditional probability of failing in interval i,
given that the unit enters this interval, is

    p̂_i = d_i / n_i.    (3.6)

The survival function and the non-parametric estimator of F(t_i) can then
be obtained through (2.4).
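The risk-set bookkeeping in (3.5)-(3.6) combined with (2.4) is easy to mechanize. A sketch, with assumed interval counts:

    import numpy as np

    def nonparametric_F(d, r, n):
        """Estimate F(t_i) from interval failure counts d and censoring
        counts r via the risk sets (3.5), p-hat (3.6), and (2.4)."""
        d, r = np.asarray(d), np.asarray(r)
        # size of risk set at the start of each interval, eq. (3.5)
        n_i = n - np.concatenate(([0], np.cumsum(d + r)[:-1]))
        p_hat = d / n_i                     # eq. (3.6)
        S_hat = np.cumprod(1.0 - p_hat)     # eq. (2.4)
        return 1.0 - S_hat

    # Illustrative (assumed) counts: 100 units, failures d per interval,
    # r units censored at the end of each interval.
    print(nonparametric_F(d=[5, 8, 4], r=[10, 12, 0], n=100))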
3.5 Pointwise Confidence Intervals from Multiply
Censored Data
3.5.1 Approximate Variance of F̂(t_i)

Var[F̂(t_i)] = Var[Ŝ(t_i)], as the two estimators sum to the constant 1. The
approximate variance can be obtained by a first-order Taylor series expansion,

    Ŝ(t_i) ≈ S(t_i) + Σ_{j=1}^{i} (∂S/∂q_j)(q̂_j − q_j),    (3.7)

where q_j = 1 − p_j and the q̂_j values are approximately uncorrelated
binomial proportions. Then

    Var[F̂(t_i)] = Var[Ŝ(t_i)] ≈ [S(t_i)]² Σ_{j=1}^{i} p_j / [n_j(1 − p_j)].    (3.8)

Note the detailed derivation of (3.8): for a single j-th term of the
summation, we have ∂S/∂q_j = Π_{k≠j} q_k = S(t_i)/q_j, and the variance
contribution of this specific term is

    (∂S/∂q_j)² Var(q̂_j) = [S(t_i)/q_j]² · q_j(1 − q_j)/n_j = [S(t_i)]² (1 − q_j)/(n_j q_j) = [S(t_i)]² p_j/[n_j(1 − p_j)].
3.5.2 Greenwood’s Formula
Substituting p̂_j and Ŝ(t_i) for p_j and S(t_i), respectively, in (3.8) gives
Greenwood's formula.
3.5.3 Pointwise Normal-Approximation Confidence Interval for F̂(t_i)

A normal-approximation 100(1 − α)% confidence interval is

    [F_, F̃] = F̂(t_i) ± z_{1−α/2} ŝe_{F̂},    (3.9)

based on the assumption that Z_{F̂} = [F̂(t_i) − F(t_i)]/ŝe_{F̂} can be
approximated by a standard normal distribution.
When the sample size is not large and the data may be heavily skewed, a
logit transformation may give a better approximation, as Z_{logit(F̂)} may
behave more like a standard normal random variable:

    Z_{logit(F̂)} = [logit(F̂(t_i)) − logit(F(t_i))]/ŝe_{logit(F̂)}.    (3.10)
3.6 Estimation from Multiply Censored Data with
Exact Failures
Exact failures arise from a continuous inspection process: as the widths of
the intervals approach zero, the step function F̂(t) increases at the
reported failure times. This limiting case of the interval-based
non-parametric estimator is known as the product-limit or Kaplan-Meier
estimator.
3.7 Simultaneous Confidence Bands

The motivation is to quantify the sampling uncertainty simultaneously over
a range of times t, instead of pointwise confidence intervals at particular
specified values of t.
Approximate 100(1 − α)% simultaneous confidence bands for F(t) can be
obtained from

    [F_(t), F̃(t)] = F̂(t) ± e_{(a,b,1−α/2)} ŝe_{F̂(t)},    (3.11)

where t ∈ [t_L(a), t_U(b)] and the approximate factor e_{(a,b,1−α/2)} was
computed from a large-sample approximation given in Nair (1984).
3.8 Uncertain Censoring Times
1. If censoring times are random and the form of their distribution is
known, a likelihood estimation method can be applied.
2. If the censoring pattern and distribution are unknown,
p̂_i = d_i/(n_i − r_i/2) is applied.
4 Location-Scale-Based Parametric Distributions
4.1 Quantities of Interest in Reliability Applications
1. Probability of failure p = P(T ≤ t) = F(t; θ) at a specified t.

2. The p quantile of the distribution of T: t_p = F^{−1}(p; θ).

3. Hazard function: h(t) = f(t; θ)/[1 − F(t; θ)].

4. Average/expectation/first moment:

    E(T) = ∫₀^∞ t f(t; θ) dt                                       (4.1)
         = −∫₀^∞ t d[1 − F(t; θ)]                                  (4.2)
         = −t[1 − F(t; θ)]|₀^∞ + ∫₀^∞ [1 − F(t; θ)] dt             (4.3)
         = ∫₀^∞ [1 − F(t; θ)] dt,                                  (4.4)

which is a measure of the center of f(t; θ).
One possible confusion: if we rewrite f(t; θ) in terms of the hazard
function above, the equation becomes E(T) = ∫₀^∞ [1 − F(t)] t h(t) dt.
However, these expressions are equivalent: because 1 − F(t) = e^{−H(t)},
H(t) = ∫₀^t h(x) dx, and d(e^{−H(t)}) = −e^{−H(t)} d(H(t)) = −e^{−H(t)} h(t) dt,

    E(T) = ∫₀^∞ [1 − F(t)] t h(t) dt                               (4.5)
         = ∫₀^∞ e^{−H(t)} h(t) t dt                                (4.6)
         = −∫₀^∞ t d(e^{−H(t)})                                    (4.7)
         = −t e^{−H(t)}|₀^∞ + ∫₀^∞ e^{−H(t)} dt                    (4.8)
         = ∫₀^∞ [1 − F(t)] dt.                                     (4.9)

It's the same.

5. Variance of T: Var(T) = ∫₀^∞ [t − E(T)]² f(t; θ) dt, with standard
deviation sd(T) = √Var(T); these measure the spread of the distribution of T.

6. Coefficient of variation of T: γ₂ = sd(T)/E(T), useful for comparing
the relative amount of variability in different distributions; 1/γ₂ is the
"signal-to-noise ratio".

7. Standardized third central moment (coefficient of skewness):

    γ₃ = ∫₀^∞ [t − E(T)]³ f(t; θ) dt / [Var(T)]^{3/2}.
4.2 Location-Scale and Log-Location-Scale Distributions

Location-scale family distributions can be written in the form

    P(Y ≤ y) = Φ((y − µ)/σ).    (4.10)

For a log-location-scale family random variable T, log(T) belongs to
the location-scale family.
4.3 Exponential Distribution
    F(t; θ, γ) = 1 − exp[−(t − γ)/θ], t > γ.    (4.11)

The hazard function is constant.
Useful for some kinds of electronic components (for example, capacitors or
high-quality integrated circuits).
4.4 Normal Distribution
    F(y; µ, σ) = Φ_nor((y − µ)/σ).    (4.12)
Increasing hazard function.
4.5 Log-normal Distribution
    F(y; µ, σ) = Φ_nor((log(y) − µ)/σ).    (4.13)
Appropriate for time to failure caused by a degradation process with
combinations of random rate constants that combine multiplicatively.
Widely used to describe the time to fracture from fatigue crack growth in
metals.
4.6 Smallest Extreme Value Distribution
    F(y; µ, σ) = Φ_sev((y − µ)/σ),    (4.14)

where Φ_sev(z) = 1 − exp[−exp(z)] in the standardized case.
Exponentially increasing hazard function suggests that SEV is suitable for
modeling the life of a product that experiences very rapid wearout after a
certain age.
4.7 Weibull Distribution
    F(t; η, β) = 1 − exp[−(t/η)^β], t > 0.    (4.15)

Relationship with SEV: if T has a Weibull distribution, then
Y = log(T) ~ SEV(µ, σ) with µ = log(η) and σ = 1/β.
4.8 Largest Extreme Value Distribution
    F(y; µ, σ) = Φ_lev((y − µ)/σ),    (4.16)

where Φ_lev(z) = exp[−exp(−z)] in the standardized case.
Less commonly used due to the positive probability of negative
observations. It can be used as a model for life if σ is small relative to
µ > 0.
4.9 Logistic Distribution
    F(y; µ, σ) = Φ_logis((y − µ)/σ),    (4.17)

where Φ_logis(z) = e^z/(1 + e^z) in standardized form.
The shape is similar to that of the normal distribution except for longer
tails; however, the behavior of the hazard function is different in the upper
tail. When computation is a concern, it is preferred to the normal
distribution because its cdf can be written in a simple closed form.
4.10 Log-logistic Distribution
    F(y; µ, σ) = Φ_logis((log(y) − µ)/σ).    (4.18)
4.11 Generating Pseudo-random Observations from a
Specified Distribution
4.11.1 Pseudo-random Observations from Continuous
Distributions
Generate U_1, ..., U_n from UNIF(0, 1); then t_i = F_T^{−1}(U_i) is a
pseudorandom sample from F_T.
4.11.2 Pseudo-random Observation from Discrete Distributions
Same general idea as Section 4.11.1, but it could be more complicated as
sometimes the discrete quantiles cannot be computed directly.
4.11.3 Efficient Generation of Censored Pseudorandom Samples
General approach based on order statistics:
Let U_(i) denote the i-th order statistic from a random sample of size n
from UNIF(0, 1). Then

    P(U_(i) ≤ u | U_(i−1) = u_(i−1)) = 1 − [(1 − u)/(1 − u_(i−1))]^{n−i+1},  u ≥ u_(i−1).    (4.19)

Applying the method in Section 4.11.1, generate a uniform random variable
U to play the role of the probability on the left-hand side and solve (4.19)
for u to obtain U_(i).

Application:
Failure-censored samples: generate a pseudorandom sample with n units
and r failures.
1. Generate U_1, ..., U_r (the probabilities) as pseudorandom observations
from UNIF(0, 1).
2. Find the order statistics U_(1), ..., U_(r), starting from U_(0) = 0, by (4.19):

    U_(1) = 1 − [1 − U_(0)] × (1 − U_1)^{1/n}
    U_(2) = 1 − [1 − U_(1)] × (1 − U_2)^{1/(n−1)}
    ...
    U_(r) = 1 − [1 − U_(r−1)] × (1 − U_r)^{1/(n−r+1)}

3. The pseudorandom sample from F(t; θ) is T_(i) = F^{−1}[U_(i); θ].

Time-censored samples: Let t_c be the censoring (cutoff) time.
1. Generate a new pseudorandom observation U_i from UNIF(0, 1).
Compute U_(i) = 1 − [1 − U_(i−1)] × (1 − U_i)^{1/(n−i+1)} and
T_(i) = F^{−1}[U_(i); θ].
2. If T_(i) > t_c, stop; otherwise set i = i + 1 and return to step 1.
5 Other Parametric Distributions
The coefficient of variation (CV) is defined as the ratio of the standard
deviation σ to the mean µ: c_v = σ/µ. It shows the extent of variability in
relation to the mean of the population.¹

¹ https://en.wikipedia.org/wiki/Coefficient_of_variation
6 Probability Plotting
Probability plots use special scales on which a cdf of a particular
distribution plots as a straight line.
6.1 Purpose
1. Assess the adequacy of a particular distributional model.
2. Provide non-parametric graphical estimates of probabilities and
distribution quantiles.
3. Obtain graphical estimation of parametric model parameters by
fitting a straight line through the points on a probability plot.
4. Display the results of a parametric maximum likelihood fit along with
the data.
6.2 Linearizing a CDF

By finding transformations of F(t) and t such that the relationship between
the transformed variables is linear, one can obtain a linearized plot of time
versus the CDF.
6.3 Probability Plotting Positions
6.3.1 Criteria for Choosing Plotting Positions
1. Checking distributional assumptions: the choice of plotting positions,
in moderate-to-large samples, is not so important.
2. Estimation of parameters: the "best" plotting positions will depend
on the assumed underlying model and the functions to be estimated.
For complete data, let i index the ordered observations; then
p_i = (i − 0.5)/n is a good choice for general-purpose use in
probability plotting.
3. Display of maximum likelihood fits with data: the criterion is that the
line "fits" the points well when the assumed model being fit with ML
agrees with the data.
6.3.2 Choice of Plotting Positions
1. Continuous inspection data and single censoring: let t_(i) be the i-th
ordered failure time, and let F̂(t) be the step function increasing by 1/n
at each reported failure time, so F̂(t_(i)) = i/n. A reasonable plotting
position is the midpoint of the jump:

    p_i = (i − 0.5)/n = ½[F̂(t_(i)) + F̂(t_(i−1))] = ½[i/n + (i − 1)/n].    (6.1)

2. Continuous inspection data and multiple censoring: due to the
multiple censoring, the step increases may differ from 1/n, so

    p_i = ½[F̂(t_(i)) + F̂(t_(i−1))].    (6.2)

3. Interval-censored inspection data: failures are recorded at the upper
endpoint of each interval, and

    p_i = F̂(t_i).    (6.3)

4. Arbitrarily censored data: with mixtures of left and right censoring, a
compromise between the other cases is required. (See the sketch after
this list for positions (6.1) in use.)
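As a sketch of positions (6.1) in use: for an assumed complete sample, plotting log(t) against log(−log(1 − p)) linearizes the Weibull cdf, and the slope and intercept of a fitted line give graphical parameter estimates. The failure times below are invented.

    import numpy as np

    # Assumed complete sample of failure times (illustrative values).
    t = np.sort(np.array([55., 187., 216., 240., 244., 335., 361., 373.]))
    n = len(t)
    p = (np.arange(1, n + 1) - 0.5) / n      # plotting positions, eq. (6.1)

    # Weibull scales: log(-log(1 - F)) = beta * log(t) - beta * log(eta)
    x = np.log(t)
    y = np.log(-np.log(1.0 - p))
    slope, intercept = np.polyfit(x, y, 1)
    beta_hat = slope                          # graphical shape estimate
    eta_hat = np.exp(-intercept / slope)      # graphical scale estimate
    print(beta_hat, eta_hat)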
6.4 Probability Plots with Specified Shape Parameter
There are some distributions that are not in the location-scale families and
cannot be transformed into such a distribution, for example, the gamma and
generalized gamma distributions. They may have one or more unknown
shape parameters. Two approaches to specifying an unknown shape
parameter for a probability plot:
1. Plot the data with different given values of the shape parameter to find
a value that gives a probability plot that is nearly linear.
2. Use parametric maximum likelihood methods to estimate the shape
parameter and use the estimated value to construct the probability
plotting scales.
7 Parametric Likelihood Fitting Concepts:
Exponential Distribution
7.1 Introduction
The appeal of maximum likelihood (ML) stems from the fact that it can be
applied to a wide variety of statistical models and kinds of data (e.g.,
continuous, discrete, categorical, censored, truncated).
Statistical theory shows that, under standard regularity conditions, ML
estimators are "optimal" in large samples (ML estimators are consistent and
asymptotically efficient, i.e., among consistent competitors to ML
estimators, none has a smaller asymptotic variance).
Besides Bayesian methods, there is no general theory that suggests
alternatives to ML that will be optimal in finite samples.
Statistical modeling, in practice, is an iterative procedure of fitting
proposed models in search of a model that provides an adequate description
of the population or process of interest without being unnecessarily
complicated.
7.2 Parametric Likelihood
Similar to equation (2.6) for non-parametric models, the parametric
likelihood function can be written as a joint probability,

    L(θ; DATA) = C Π_{i=1}^{n} L_i(θ; data_i).    (7.1)

In practice, the log-likelihood is more widely used, with its natural sums:

    L(θ) = log[L(θ)] = Σ_{i=1}^{n} L_i(θ).    (7.2)
7.3 Confidence Intervals for θ
7.3.1 Likelihood Confidence Intervals for θ
R(θ) = L(θ)/L(θ̂) is the relative likelihood for θ. An approximate
100(1 − α)% likelihood-based confidence interval is the set of all values of θ
such that

    −2 log[R(θ)] ≤ χ²_{1−α;k},    (7.3)

where k is the number of parameters in θ. If −2 log[R(θ₀)] > χ²_{1−α;k},
the hypothesis θ = θ₀ is rejected.
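A sketch of a likelihood-ratio interval for the exponential mean θ, assuming censored data summarized by r failures and total time on test TTT, so that L(θ) = −r log θ − TTT/θ up to a constant; the numbers are invented.

    import numpy as np
    from scipy.stats import chi2
    from scipy.optimize import brentq

    # Assumed data summary: r failures, total time on test TTT.
    r, ttt = 10, 25000.0
    theta_hat = ttt / r

    def loglik(theta):
        # exponential log-likelihood with censoring, up to a constant
        return -r * np.log(theta) - ttt / theta

    def profile(theta, alpha=0.05):
        # -2 log R(theta) - chi2 quantile: crosses zero at the endpoints
        return (-2.0 * (loglik(theta) - loglik(theta_hat))
                - chi2.ppf(1.0 - alpha, df=1))

    lower = brentq(profile, theta_hat / 10.0, theta_hat)
    upper = brentq(profile, theta_hat, theta_hat * 10.0)
    print(lower, upper)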
7.3.2 Normal-Approximation Confidence Intervals for θ

A 100(1 − α)% normal-approximation confidence interval for θ is

    [θ_, θ̃] = θ̂ ± z_{1−α/2} ŝe_{θ̂},    (7.4)

where ŝe_{θ̂} = [−d²L(θ)/dθ² |_{θ=θ̂}]^{−1/2}.
It's based on the assumption that Z = (θ̂ − θ)/ŝe_{θ̂} can be approximated
by NOR(0, 1).
An alternative approximate confidence interval for positive quantities is

    [θ_, θ̃] = [θ̂/w, θ̂ × w],    (7.5)

where w = exp[z_{1−α/2} ŝe_{θ̂}/θ̂]. It is obtained by exponentiating the
endpoints of log(θ̂) ± z_{1−α/2} ŝe_{log(θ̂)}, with ŝe_{log(θ̂)} = ŝe_{θ̂}/θ̂.
It's based on the assumption that the distribution of

    Z_{log(θ̂)} = [log(θ̂) − log(θ)]/ŝe_{log(θ̂)}    (7.6)

can be approximated by a NOR(0, 1) distribution.
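A sketch comparing (7.4) and (7.5) for the exponential case, where ŝe_{θ̂} = θ̂/√r; the inputs are invented.

    import numpy as np
    from scipy.stats import norm

    # Assumed exponential ML results: theta_hat and its standard error.
    r, ttt = 10, 25000.0
    theta_hat = ttt / r
    se_theta = theta_hat / np.sqrt(r)   # [-d2L/dtheta2]^(-1/2) for this model

    z = norm.ppf(0.975)
    # symmetric interval (7.4)
    print(theta_hat - z * se_theta, theta_hat + z * se_theta)
    # log-transformed interval (7.5), which respects theta > 0
    w = np.exp(z * se_theta / theta_hat)
    print(theta_hat / w, theta_hat * w)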
7.4 Confidence Intervals for Function of θ
For one-parameter distributions, confidence intervals for θ can be translated
directly into confidence intervals for monotone functions of θ.
For models with more than one parameter, a collection of intervals must be
handled differently, because the confidence level applies only to the process
of constructing an interval at a single point in time t_e.
7.5 Likelihood for Exact Failure Times
7.5.1 Correct Likelihood for Observations Reported as Exact
Failures
Sometimes failure times are reported as exact times when the observation
process is actually discrete. Then the "correct likelihood" is

    L_i(θ; data_i) = ∫_{t_{i−1}}^{t_i} f(t; θ) dt = F(t_i; θ) − F(t_{i−1}; θ).    (7.7)
7.5.2 Using Density Approximation for Observations Reported
as Exact Values
For most statistical models, the contribution to the likelihood of the
observations reported as exact values can, for small Δ_i > 0, be
approximated by

    F(t_i; θ) − F(t_i − Δ_i; θ) ≈ f(t_i; θ) Δ_i,    (7.8)

where Δ_i doesn't depend on θ. Then the likelihood

    L_i(θ) = f(t_i; θ)    (7.9)

differs from (7.8) only by a constant scale factor. As long as the
approximation in (7.8) is adequate, the general character of the likelihood
is not affected, and (7.9) can be used as the likelihood for an observation
with "exact failure time" t_i.
7.6 Data Analysis with No Failures
It's possible that there will be no failures in data on high-reliability
components. With zero failures and an assumed exponential distribution, a
conservative 100(1 − α)% lower confidence bound on θ is

    θ_ = 2 · TTT/χ²_{1−α;2} = TTT/[−log(α)],    (7.10)

where TTT denotes the total time on test and χ²_{1−α;2} = −2 log(α).²

² Solve the quantile using the chi-square distribution formula, https://en.wikipedia.org/wiki/Chi-squared_distribution
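A quick numerical check of (7.10), under the assumption of, say, 50 units each surviving 1000 hours with zero failures (so TTT = 50,000 unit-hours):

    import numpy as np
    from scipy.stats import chi2

    ttt, alpha = 50 * 1000.0, 0.05
    theta_lower = 2.0 * ttt / chi2.ppf(1.0 - alpha, df=2)
    print(theta_lower, ttt / -np.log(alpha))   # the two forms of (7.10) agree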
8 Maximum Likelihood for
Log-Location-Scale Distributions
8.1 Likelihood
    L(µ, σ) = Π_{i=1}^{n} [f(y_i; µ, σ)]^{δ_i} [1 − F(y_i; µ, σ)]^{1−δ_i},    (8.1)

where δ_i = 1 if y_i is an exact observation, and δ_i = 0 if y_i is a
right-censored observation.
8.2 Likelihood Confidence Regions and Intervals
8.2.1 Joint Confidence Regions for µ and σ
Using the large-sample χ² approximation for the distribution of the
likelihood-ratio statistic, an approximate 100(1 − α)% joint confidence
region for the two-dimensional parameter (µ, σ) is the set of all values with

    R(µ, σ) > exp[−χ²_{1−α;2}/2].    (8.2)
8.2.2 Individual Confidence Intervals for µ and σ
The profile likelihood for µ is

    R(µ) = max_σ [L(µ, σ)/L(µ̂, σ̂)].    (8.3)

For any given value of µ, one finds the point of highest relative
likelihood by maximizing over σ.
The 100(1 − α)% approximate confidence interval is given by

    R(µ) > exp[−χ²_{1−α;1}/2].    (8.4)

A confidence interval for σ can be derived similarly.
8.2.3 Likelihood Confidence Intervals for Functions of µ and σ
Due to the invariance property of ML estimators, likelihood-based methods
can be applied to make inferences about functions of the parameters.
For example, for the p quantile t_p = exp[µ + Φ^{−1}(p)σ], substitute
µ = log(t_p) − Φ^{−1}(p)σ and compute

    R(t_p) = max_σ [L(t_p, σ)/L(µ̂, σ̂)].
8.3 Normal-Approximation Confidence Intervals

1. First compute the variance-covariance matrix. For a location-scale
distribution, the local estimate Σ̂_{µ̂,σ̂} of Σ_{µ̂,σ̂} is the inverse of the
observed information matrix:

    Σ̂_{µ̂,σ̂} = [ V̂ar(µ̂)      Ĉov(µ̂, σ̂) ]   = [ −∂²L(µ,σ)/∂µ²     −∂²L(µ,σ)/∂µ∂σ ]^{−1}
                [ Ĉov(µ̂, σ̂)   V̂ar(σ̂)   ]     [ −∂²L(µ,σ)/∂µ∂σ   −∂²L(µ,σ)/∂σ²  ]        (8.5)

evaluated at (µ̂, σ̂). The partial second derivatives describe the curvature
of the log-likelihood at the ML estimate; more curvature in the
log-likelihood surface implies a more concentrated likelihood near (µ, σ),
and this implies more precision.

2. Then, for the parameters, ŝe values can be derived for the
normal-approximation confidence intervals.

3. For normal-approximation confidence intervals for functions of the
parameters, say g = g(µ, σ), one can first apply the delta method for the
variance of the function (see the sketch after this list), i.e.,

    ŝe_ĝ = √V̂ar(ĝ) = [ (∂g/∂µ)² V̂ar(µ̂) + 2(∂g/∂µ)(∂g/∂σ) Ĉov(µ̂, σ̂) + (∂g/∂σ)² V̂ar(σ̂) ]^{1/2},

and then find the approximate confidence interval based on it.

4. The intervals can be improved slightly by using t_{(p,ν)} instead of z_p.
8.4 Estimation with Given σ
With such information available, one can provide considerably more
precision from limited data. However, the danger is the given value of σ
may be seriously incorrect, resulting in misleading conclusions.
9 Bootstrap Confidence Intervals
Simulation based intervals provide another important method to obtain
exact or more accurate approximate confidence intervals. It’s expected to
be more accurate than the normal approximation methods and competitive
with the likelihood-based methods.
9.1 Bootstrap Sampling
General idea: A confidence interval procedure is judged on the basis of how
well it would perform if it were repeated over and over again. A
confidence interval should not be too wide, and the coverage probability
(the probability that the interval contains the quantity of interest) should
be equal or close to the nominal coverage probability 1 − α. The idea of
bootstrap sampling is to simulate the repeated sampling process and use
the information from the distribution of appropriate statistics in the
bootstrap samples to compute the needed confidence interval, reducing the
reliance on large-sample approximations.
When the goal is to compute confidence intervals, the usual
recommendation is to use between B = 2000 and B = 5000 bootstrap
samples. Larger B values are recommended for estimating the more extreme
quantiles of the bootstrap distribution that are required for higher
confidence levels.
Sampling methods:
1. Fully "parametric" bootstrap procedure: simulate each sample of size
n from the assumed parametric distribution, using the ML estimates
computed from the actual data in place of the unknown parameters.
The disadvantage is that it requires complete specification of the censoring
process; this is not a problem with simple censoring such as Type I or
Type II, but is more difficult for complicated systematic or random censoring.
2. "Non-parametric" bootstrap sampling: each sample of size n is
obtained by sampling, with replacement, from the actual data cases
in the original data set.
9.2 Confidence Intervals
Take the exponential distribution as an example.
Monte Carlo simulation-based methods can be used to obtain a better
approximation to the distribution of Z_{log(θ̂)} than assuming that
Z_{log(θ̂)} ∼ NOR(0, 1). Here are the steps:

1. Simulate B bootstrap samples of size n.

2. Compute the ML estimates for each bootstrap sample.

3. For the j-th bootstrap sample, compute
Z_{log(θ̂*_j)} = [log(θ̂*_j) − log(θ̂)]/ŝe_{log(θ̂*_j)}.
The intervals are based on the distribution of these t-like statistics, so
the method is also called the bootstrap-t method.

4. Intervals for θ based on the assumption that the simulated
distribution of Z_{log(θ̂*)} provides a good approximation to the
distribution of Z_{log(θ̂)}:

    [θ_, θ̃] = [θ̂/w_, θ̂ × w̃],    (9.1)

where w_ = exp[z_{log(θ̂*); 1−α/2} ŝe_{θ̂}/θ̂], w̃ = exp[−z_{log(θ̂*); α/2} ŝe_{θ̂}/θ̂],
ŝe_{θ̂} = [−d²L(θ)/dθ² |_{θ=θ̂}]^{−1/2}, and z_{log(θ̂*); p} denotes the p
quantile of the simulated distribution of Z_{log(θ̂*)}.
Note that the interval here is not symmetric as in Section 7.3.2, because
the z values are based on the quantiles of the distribution of Z_{log(θ̂*)}.

5. Intervals for θ based on the assumption that the simulated
distribution of Z_{θ̂*} provides a good approximation to the distribution
of Z_{θ̂}:

    [θ_, θ̃] = [θ̂ − z_{θ̂*; 1−α/2} ŝe_{θ̂}, θ̂ − z_{θ̂*; α/2} ŝe_{θ̂}],    (9.2)

where z_{θ̂*; p} is the p quantile of the simulated distribution of Z_{θ̂*}.

Bootstrap confidence intervals for monotone functions of θ can be
obtained by applying the transformation to the interval endpoints.
For distributions with more than one parameter, for example, the
Weibull distribution, intervals can be obtained in a similar manner,
i.e., base the bootstrap-t confidence intervals on bootstrap evaluations
of the distributions of t-like statistics such as Z_{µ̂}, Z_{log(σ̂)},
Z_{logit(F̂(t))}, ...
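A sketch of the parametric bootstrap-t procedure for a complete exponential sample, where θ̂ is the sample mean and ŝe_{log(θ̂)} = 1/√n; the data are synthetic.

    import numpy as np

    rng = np.random.default_rng(7)

    # Assumed: complete exponential sample of size n (synthetic data).
    n, theta_true = 25, 2000.0
    t = rng.exponential(theta_true, size=n)
    theta_hat = t.mean()                 # ML estimate
    se_log = 1.0 / np.sqrt(n)            # se of log(theta_hat), this model

    B = 4000
    z_star = np.empty(B)
    for j in range(B):
        tb = rng.exponential(theta_hat, size=n)   # parametric bootstrap sample
        theta_b = tb.mean()
        z_star[j] = (np.log(theta_b) - np.log(theta_hat)) / se_log

    zlo, zhi = np.quantile(z_star, [0.025, 0.975])
    # invert Z_log(theta) as in (9.1)
    print(theta_hat * np.exp(-zhi * se_log), theta_hat * np.exp(-zlo * se_log))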
9.3 Percentile Bootstrap Method
When it's not easy to compute the standard error for an estimate, the
percentile bootstrap described by Efron and Tibshirani (1993) provides a
simple, useful alternative.

1. Suppose there are B simulations, giving estimates θ̂*_j of θ for
j = 1, ..., B.

2. The 100(1 − α)% percentile bootstrap interval for θ is

    [θ_, θ̃] = [θ̂*_[l], θ̂*_[u]],    (9.3)

where θ̂*_[j] denotes the j-th value of the bootstrap estimates ordered from
smallest to largest, l = B × (α/2), and u = B × (1 − α/2); l and u are
rounded to the next lowest and next highest integer, respectively.

The advantage is that the interval doesn't depend on the transformation
scale of the parameter ("transformation preserving"). It doesn't work as
well as the previous bootstrap-t method for small samples, though.
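A sketch of (9.3); the rounding of l down and u up follows the text, and the bootstrap estimates fed in at the end are generated quickly for illustration.

    import numpy as np

    def percentile_interval(boot_estimates, alpha=0.05):
        """Percentile bootstrap interval (9.3) from B bootstrap estimates."""
        b = np.sort(np.asarray(boot_estimates))
        B = len(b)
        l = int(np.floor(B * alpha / 2.0))          # round down
        u = int(np.ceil(B * (1.0 - alpha / 2.0)))   # round up
        return b[max(l - 1, 0)], b[min(u - 1, B - 1)]

    # Illustrative bootstrap estimates of an exponential mean:
    rng = np.random.default_rng(7)
    boot = rng.exponential(2000.0, size=(4000, 25)).mean(axis=1)
    print(percentile_interval(boot))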
10 Planning Life Tests
10.1 Introduction
Idea: When some "planning information" about the life distribution is
available, it's possible to assess the effect of sample size and test length on
the outcome of a particular test plan. Such planning information is
typically obtained from design specifications, expert opinion, or previous
experience with similar products or materials.
A superscript box (e.g., θ□) represents a planning value of a population or
process quantity.
Simulation of a Proposed Test Plan
1. Use the chosen model and planning values of the distribution to
simulate data from the proposed life test.
2. Analyze the data, perhaps fitting more than one distribution.
3. Assess the precision of estimates: this can be done initially by computing
approximate confidence intervals, as one would with real data.
4. Simulate many samples and fit distributions for each, assess the
sample-to-sample differences. Such multiple simulations provide an
assessment of estimation precision.
5. Repeat the simulation-evaluation process with different sample sizes
to gauge the actual sample size and test length requirements to
achieve the desired precision.
6. Repeat the simulation-evaluation process with different input
“planning values” over the range of their uncertainty.
In general, simulation is a useful method of assessing variability. To control
the standard deviation of an estimator to a specified degree of precision, it’s
possible to interpolate among simulated values at different sample sizes.
10.2 Approximate Variance of ML Estimators
10.2.1 Basic Large-Sample Approximation
For a model with θ = (θ_1, ..., θ_k), the following results hold approximately
in large samples:

1. θ̂ follows a multivariate normal distribution with mean vector θ and
covariance matrix Σ_θ̂.

2. The large-sample approximate covariance matrix can be computed
as Σ_θ̂ = I_θ^{−1}, with the Fisher information

    I_θ = E[−∂²L(θ)/∂θ∂θ′] = Σ_{i=1}^{n} E[−∂²L_i(θ)/∂θ∂θ′].    (10.1)

In practical problems, interest will center on one or more scalar
functions of the parameters, say g = g(θ). Then, in large samples:

I: Assume the distribution of ĝ = g(θ̂) can be approximated by a normal
distribution,

    ĝ ∼ NOR(g(θ), Ase(ĝ)),    (10.2)

where

    Ase(ĝ) = (1/√n)·√V_ĝ,  V_ĝ = n·Avar(ĝ),
    Avar(ĝ) = [∂g(θ)/∂θ]′ Σ_θ̂ [∂g(θ)/∂θ].

II: If g(θ) is positive for all θ, then an alternative form is better. Assume
the distribution of log(ĝ) = log(g(θ̂)) can be approximated by a normal
distribution,

    log(ĝ) ∼ NOR(log(g(θ)), Ase(log(ĝ))),    (10.3)

where

    Ase(log(ĝ)) = (1/√n)·√V_{log(ĝ)},  V_{log(ĝ)} = n·Avar(log(ĝ)),
    Avar(log(ĝ)) = Avar(ĝ)/g².
10.3 Sample Size for Unrestricted Functions
Assume a normal distribution for ĝ, and let D be the desired confidence
interval half-width:

    D = z_{1−α/2} · (1/√n) · √V̂_ĝ,    (10.4)

which gives

    n = z²_{1−α/2} · V_ĝ / D².    (10.5)

A similar approach is used to determine sample sizes for other kinds of
distributions.
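A two-line check of (10.5); the planning value V_g and target half-width D are invented.

    from scipy.stats import norm

    V_g = 4.0e6      # assumed planning value of V_g = n * Avar(g_hat)
    D = 250.0        # desired half-width, in the units of g
    z = norm.ppf(0.975)
    n = (z ** 2) * V_g / D ** 2      # eq. (10.5)
    print(n)         # round up to the next whole unit in practice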
11 Parametric Maximum Likelihood: Other
Models
11.1 Truncated Data and Truncated Distributions
Censored data: occur when there is a bound on an observation, that is,
lower bounds for observations censored on the right, upper bounds for
observations censored on the left, and both upper and lower bounds for
interval-censored data.

Truncated data: arise when even the existence of a potential observation
would be unknown if its value were to lie in a certain range.

Table 1: Difference between Censored and Truncated Data
1. Likelihood with left truncation:

    L_i(θ) = P(t_i^L < T_i ≤ t_i | T_i > τ_i^L) = [F(t_i; θ) − F(t_i^L; θ)] / [1 − F(τ_i^L; θ)],  t_i > t_i^L ≥ τ_i^L.    (11.1)

For an observation with exact failure time t_i,

    L_i(θ) = f(t_i; θ) / [1 − F(τ_i^L; θ)],  t_i > τ_i^L.    (11.2)

2. Likelihood with right truncation: similar to the left-truncation case.

3. Likelihood with right and left truncation for an interval-censored
observation:

    L_i(θ) = P(t_i^L < T_i ≤ t_i | τ_i^L ≤ T < τ_i^U) = [F(t_i; θ) − F(t_i^L; θ)] / [F(τ_i^U; θ) − F(τ_i^L; θ)],  τ_i^L ≤ t_i^L < t_i ≤ τ_i^U.    (11.3)
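A sketch of the left-truncated likelihood (11.2) for exact failure times under an assumed Weibull model; the failure and truncation times are invented.

    import numpy as np

    def weib_logpdf(t, eta, beta):
        return (np.log(beta / eta) + (beta - 1) * np.log(t / eta)
                - (t / eta) ** beta)

    def weib_logsf(t, eta, beta):
        return -(t / eta) ** beta     # log[1 - F(t)]

    def left_truncated_loglik(t, tau, eta, beta):
        """Sum of log L_i in (11.2): exact failure times t_i, observed only
        because T_i exceeded the left-truncation times tau_i."""
        t, tau = np.asarray(t), np.asarray(tau)
        return np.sum(weib_logpdf(t, eta, beta) - weib_logsf(tau, eta, beta))

    # Illustrative (assumed) failure and truncation times:
    print(left_truncated_loglik(t=[300., 520., 700.], tau=[100., 250., 400.],
                                eta=1000.0, beta=2.0))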
11.2 Summary
General guidelines for fitting parametric distributions to skewed data.
1. If the data are left-skewed, it's possible to fit a three-parameter Weibull
distribution and achieve a good fit to the data. In many cases,
however, it will be possible to fit the simpler two-parameter Weibull
or smallest extreme value distribution and get, effectively, the same
results.
2. If the data are approximately symmetric, one can generally fit a
three-parameter Weibull, a three-parameter lognormal model, or
two-parameter versions of these distributions and get a reasonably good
fit. In many cases, however, it will be possible to fit the simpler
two-parameter normal or logistic distributions and get, effectively, the
same results.
3. If the data are right-skewed, it's often possible to fit a three-parameter
Weibull or lognormal distribution to get a good fit to the data.
4. The use of a threshold parameter can be viewed from two different
directions. Sometimes it might be viewed as a physical parameter:
a time before which the probability of failure is zero, or a threshold
strength. In other cases, γ is simply one of several parameters of a curve
being fit to the data.
12 Prediction of Future Random Quantities
12.1 Probability Prediction Intervals (θ Given)

With a completely specified continuous probability distribution, an exact
100(1 − α)% "probability prediction interval" for a future observation from
F(t; θ) is (ignoring the data)

    PI(1 − α) = [T_, T̃] = [t_{α/2}, t_{1−α/2}],    (12.1)

where t_p is the p quantile of F(t; θ). The coverage probability of the
interval in (12.1) is

    P[T ∈ PI(1 − α)] = P(T_ ≤ T ≤ T̃) = P(t_{α/2} ≤ T ≤ t_{1−α/2}) = 1 − α    (12.2)

for a continuous distribution.
12.2 Statistical Prediction Intervals (θ Estimated)

12.2.1 Coverage Probability Concepts

In statistical prediction, the objective is to predict the random quantity T
based on "learning" sample information (denoted by DATA). Generally,
with only sample data, there is uncertainty in the distribution parameters.
There are two kinds of coverage probabilities:

1. For fixed DATA (and thus fixed θ̂ and [T_, T̃]), the conditional
coverage probability of a particular interval [T_, T̃] is

    CP[PI(1 − α) | θ̂; θ] = P(T_ ≤ T ≤ T̃ | θ̂; θ) = F(T̃; θ) − F(T_; θ).    (12.3)

This conditional probability is unknown because F(t; θ) depends on
the unknown θ.

2. From sample to sample, the conditional coverage probability is
random because [T_, T̃] depends on θ̂. The unconditional coverage
probability for the prediction interval procedure is

    CP[PI(1 − α); θ] = P(T_ ≤ T ≤ T̃; θ) = E_θ̂{CP[PI(1 − α) | θ̂; θ]}.    (12.4)

This is the one generally used to describe a prediction interval
procedure. In general, CP[PI(1 − α); θ] ≠ 1 − α because of the
dependency on the unknown θ.
12.2.2 Naive Method for Computing a Statistical Prediction
Interval
A “naive” prediction interval for continuous T is obtained by substituting
the maximum likelihood estimate for θ into (12.1).
12.3 The (Approximate) Pivotal Method for
Prediction Intervals
12.3.1 Type II (Failure) Censoring
A life test is run until a specified number of failures (r). When T has a
log-location-scale distribution, and the data are complete or Type II
censored, the random variable Z
log(T )
is pivotal,i.e., the distribution
depends only on n and r, not θ.
The prediction interval can be obtained from,
P [ˆµ + z
log(T )
α/2
× ˆσ < log(T ) ˆµ + z
log(T )
1α/2
× ˆσ] = 1 α. (12.5)
Then quantiles z
log(T )
α/2
, z
log(T )
1α/2
can be obtained from the distribution of
Z
log(T )
, which can be obtained approximately(due only to Monte Carlo
error) by simulating B realizations of Z
log(T
)
=
log(T
)ˆµ
ˆσ
with the following
steps,
1. Draw a sample of size n from a log-location-scale distribution with
parameters (μ̂, σ̂), censored at the r-th failure.
2. Use the censored sample to compute the ML estimates μ̂* and σ̂*.
3. Draw an additional single observation T* from the log-location-scale
distribution with parameters (μ̂, σ̂).
4. Compute Z_{log(T*)} = [log(T*) − μ̂*]/σ̂*.
5. Repeat steps 1 to 4 B times. Obtain the approximation for the quantiles
z^{log(T)}_{α/2} and z^{log(T)}_{1−α/2} from the empirical distribution of Z_{log(T*)}.
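A minimal simulation sketch of these steps, assuming a lognormal life
distribution (so the fitting is done on the normal log scale); the helper
names and the Nelder-Mead fit of the Type II censored log-likelihood are my
own choices, not from the text:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def ml_lognormal_type2(logt, n, r):
    """ML estimates (mu, sigma) from the r smallest of n log failure times
    (Type II censoring at the r-th failure)."""
    tc = logt[r - 1]                      # log of the r-th failure time
    def negloglik(par):
        mu, logsig = par
        sig = np.exp(logsig)
        z = (logt[:r] - mu) / sig
        # failures contribute densities; the n - r censored units contribute
        # survival probabilities beyond the r-th failure time
        return -(np.sum(norm.logpdf(z) - logsig)
                 + (n - r) * norm.logsf((tc - mu) / sig))
    res = minimize(negloglik,
                   x0=[np.mean(logt[:r]), np.log(np.std(logt[:r]) + 0.1)],
                   method="Nelder-Mead")
    return res.x[0], np.exp(res.x[1])

def pivotal_z_quantiles(mu_hat, sig_hat, n, r, alpha=0.05, B=2000, seed=None):
    """Steps 1-5: empirical alpha/2 and 1-alpha/2 quantiles of Z_log(T*)."""
    rng = np.random.default_rng(seed)
    z = np.empty(B)
    for b in range(B):
        logt = np.sort(rng.normal(mu_hat, sig_hat, size=n))  # step 1
        mu_s, sig_s = ml_lognormal_type2(logt, n, r)         # step 2
        logT = rng.normal(mu_hat, sig_hat)                   # step 3
        z[b] = (logT - mu_s) / sig_s                         # step 4
    return np.quantile(z, [alpha / 2, 1 - alpha / 2])        # step 5
```

The prediction interval for a future T then follows from (12.5) as
exp(μ̂ + z σ̂), evaluated at the two simulated quantiles.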
12.3.2 Type I Censoring
Z_{log(T)} is only approximately pivotal: the quantiles of Z_{log(T)} depend on
F(t_c; θ), the unknown expected proportion failing by the censoring time t_c,
and on the sample size n. Thus the prediction interval for T is approximate.
13 Degradation Data, Models, and Data
Analysis
13.1 Introduction
Design of high-reliability systems generally requires that the individual
system components have extremely high reliability, even after long
periods of time. With short development times, tests must be conducted
under severe time constraints. Frequently no failures occur during such tests.
A relationship between component failure and amount of degradation
makes it possible to use degradation models and data to make inferences
and predictions about failure time.
13.2 Models for Degradation Data
13.2.1 Degradation Data
Degradation Data:
1. The measurement of physical degradation as a function of time.
2. When actual physical degradation cannot be measured, measures of
product performance degradation may be available.
Both kinds are generally referred to as “degradation data”. Modeling
performance degradation data may be useful but complicated because
performance may be affected by more than one underlying degradation
process.
Advantages of degradation data:
1. Degradation data can provide considerably more reliability
information than traditional censored failure-time data.
2. Accelerated tests are commonly used to obtain reliability information
more quickly. Direct observation of the physical degradation process,
or some closely related surrogate, may allow direct modeling of the
failure-causing mechanism, providing more credible and precise
reliability estimates and a firmer basis for often-needed extrapolation.
13.2.2 Degradation Leading to Failure
Most failures can be traced to an underlying degradation process.
Figure 2 below shows different possible shapes of the degradation curves.
Figure 2: Possible Shapes for Univariate Degradation Curves
13.2.3 Models for Variation in Degradation and Failure Times
There is some degree of variability in all of the model factors, as well as in
factors that are not in the model. These factors combine to cause
variability in the degradation curves and in failure times.
1. Unit-to-Unit Variability
Initial Conditions
Material Properties
Component Geometry or Dimensions
Within-Unit Variability: often there will be spatial variability in
material properties within a unit (e.g., defects).
2. Variability Due to Operating and Environmental Conditions: For
example, stress and temperature.
13.2.4 Limitations of Degradation Data
1. Physical degradation or performance degradation are natural
properties to measure in many testing processes. However, degradation
measurement often requires destructive inspection or disruptive
measurement. In such situations, one can obtain only a single
measurement on each unit tested.
2. Also, when degradation data are contaminated with large amounts of
measurement error, or when the degradation measure is not closely
related to failure, the advantages of degradation data are
compromised. For example, when the measurement is of performance
degradation, failures may occur for physical reasons that cannot be
observed directly.
An important but difficult engineering challenge of degradation analysis is
to find variables that are closely related to failure time and develop
methods for accurately measuring these variables.
13.2.5 General Degradation Path Model
The actual degradation path of a particular unit over time is denoted by
D(t), t > 0. The observed sample degradation y_ij of unit i at time t_j is

y_ij = D_ij + ε_ij,   i = 1, …, n,  j = 1, …, m_i,   (13.1)

where D_ij = D(t_ij, β_1i, …, β_ki) is the actual path of the unit and
ε_ij ∼ NOR(0, σ_ε). Typically, β is a vector of k unknown parameters; some
paths have k = 1, 2, 3, or 4 parameters. Some of the β will be random from
unit to unit.
Assumptions:
1. The random β are independent of the ε_ij deviations.
2. σ_ε is constant. The adequacy of this assumption can be affected by
transforming D(t). If there is autocorrelation among the
ε_ij, j = 1, …, m_i, one can use a time series model for the residual
term along with appropriate estimation methods.
13.2.6 Degradation Model Parameters
We focus on making inferences about the population or process, or
predictions about future units. The underlying model parameters are
θ_β = (μ_β, Σ_β), the mean vector and covariance matrix of the random β.
13.3 Estimation of Degradation Model Parameters
The likelihood for the random-parameter degradation model in Section
13.2.5 is

L(θ_β, σ_ε | DATA) = ∏_{i=1}^{n} ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} [ ∏_{j=1}^{m_i} (1/σ_ε) φ_nor(ζ_ij) ]
                     × f_β(β_1i, …, β_ki; θ_β) dβ_1i ··· dβ_ki,   (13.2)

where ζ_ij = [y_ij − D(t_ij, β_1i, …, β_ki)]/σ_ε and f_β(β_1i, …, β_ki; θ_β) is the
multivariate normal distribution density function.
The parameters to be estimated are (μ_β, Σ_β, σ_ε).
13.4 Models Relating Degradation and Failure
13.4.1 Soft Failures: Specified Degradation Level
Failure is defined (in a somewhat arbitrary but purposeful manner) as
reaching a specified level of degradation when the product has a gradual
loss of performance; for example, light bulb output. A fixed value D_f, the
critical degradation level for failure, will be used.
13.4.2 Hard Failures: Joint Distribution of Degradation and
Failure Level
The definition of a failure is clear: the product stops working.
With hard failures, failure times will not, in general, correspond exactly
to a particular level of degradation. Instead, the level of degradation at
which failure occurs will be random from unit to unit and even over time.
This could be modeled by using a distribution to describe the unit-to-unit
variability in D_f, or a joint distribution for β and the stochastic behavior
of D_f, where D_f is the critical degradation level for failure.
13.5 Evaluation of F (t)
A specified model for D(t) and D_f defines a failure-time distribution. This
distribution can be written as a function of the degradation model
parameters. Suppose that a unit fails at time t if its degradation level
first reaches D_f at time t. Then

P(T ≤ t) = F(t) = F(t; θ_β) = P(D(t; β_1, …, β_k) ≥ D_f).   (13.3)
That is to say, for a fixed D_f, the distribution of T depends on the
distribution of β_1, …, β_k, which in turn depends on the basic path
parameters θ_β.
For most practical models, especially when D(t) is nonlinear and more than
one of β_1, …, β_k is random, F(t) cannot be written in a closed form, and
numerical evaluation methods need to be used.
13.5.1 Analytic Solution for F (t)
For some simple path models, F(t) can be expressed as a function of the
basic path parameters. Consider the path of a particular unit,

D(t) = β_1 + β_2 t,   (13.4)

where β_1 is fixed and β_2 ∼ LOGNOR(μ, σ); i.e., β_1 is the common initial
amount of degradation of all the test units, and β_2 is the random
degradation rate for the different units. Then

F(t; β_1, μ, σ) = P[D(t) ≥ D_f]
               = P[β_2 > (D_f − β_1)/t]
               = 1 − Φ_nor( [log((D_f − β_1)/t) − μ] / σ ),   t > 0.   (13.5)
13.5.2 Numerical Evaluation of F (t)
When a closed form of F(t) does not exist, one can often still evaluate it by
direct numerical integration.
13.5.3 Monte Carlo Evaluation of F (t)
Monte Carlo simulation is a versatile method for evaluating F(t). The idea
is to generate a large number of random sample paths from the assumed path
model and use the proportion of paths crossing D_f by time t as an
evaluation of F(t).
Algorithm
1. Generate N simulated realizations β̌_1, …, β̌_k of β_1, …, β_k from a
multivariate normal distribution with mean μ̂_β and covariance matrix
Σ̂_β. N is a large number, for example, 10,000.
2. Compute the N simulated failure times corresponding to the N
realizations of β̌_1, …, β̌_k by solving D(t; β̌_1, …, β̌_k) = D_f for t.
3. For any desired value of t, compute

F̂(t) ≈ (Number of simulated first crossing times ≤ t) / N,   (13.6)

and this is an evaluation of F(t).

The error of this approximation is evaluated by using the binomial
distribution: the standard deviation of the Monte Carlo error in F(t) is
sqrt( F(t)[1 − F(t)]/N ).
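A small sketch of this algorithm for the linear path of Section 13.5.1
(fixed β_1, lognormal β_2), where the first crossing time has the closed
form (D_f − β_1)/β_2, so the Monte Carlo result can be checked against
(13.5); all numerical values are made up:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Illustrative linear path D(t) = beta1 + beta2*t with fixed beta1 and
# lognormal beta2 (the model of Section 13.5.1); numbers are hypothetical.
beta1, mu, sigma, Df, N = 0.5, np.log(0.02), 0.4, 2.0, 10_000

beta2 = rng.lognormal(mean=mu, sigma=sigma, size=N)   # step 1
t_cross = (Df - beta1) / beta2                        # step 2: crossing times

def F_mc(t):
    """Step 3: proportion of simulated crossing times <= t, Eq. (13.6)."""
    return np.mean(t_cross <= t)

t = 100.0
est = F_mc(t)
mc_se = np.sqrt(est * (1 - est) / N)                  # binomial MC error
exact = 1 - norm.cdf((np.log((Df - beta1) / t) - mu) / sigma)  # Eq. (13.5)
print(f"F({t}) ~ {est:.3f} (MC se {mc_se:.4f}); analytic {exact:.3f}")
```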
13.6 Estimation of F (t)
1. Closed form exists: estimate the failure-time distribution F(t) by
substituting the estimates θ̂_β into Equation (13.3).
2. No closed form: the algorithms in Sections 13.5.2 and 13.5.3 can be used
to evaluate Equation (13.3) at θ̂_β.
13.7 Bootstrap Confidence Intervals
The bias-corrected percentile bootstrap method is used to obtain
confidence intervals for F(t):
1. Use the observed data from the n sample paths to compute the
estimates θ̂_β and σ̂_ε.
2. Use the algorithms in Sections 13.5.2 and 13.5.3 (either numerical or
Monte Carlo evaluation) to compute the estimate F̂(t) at the desired
values of t. This is the estimated F(t).
3. Generate a large number B (e.g., B = 4000) of bootstrap samples that
mimic the original sample, and compute the corresponding bootstrap
estimates F̂*(t) with the following steps:
- Generate n simulated realizations of the random path
parameters β̌_1i, …, β̌_ki, i = 1, …, n, from θ̂_β.
- Using the same sampling scheme as in the original experiment,
compute the n simulated observed paths from Equation (13.1),

y*_ij = D_ij + ε_ij,   i = 1, …, n,  j = 1, …, m_i,   (13.7)

up to the planned stopping time t_c, where the ε_ij are generated from
the normal distribution with the σ̂_ε of Step 1, D_ij = D(t_ij; β̌_1i, …, β̌_ki),
and the t_ij are the same as in the original data.
- Use the n simulated paths to estimate the parameters of the path
model, giving the bootstrap estimates θ̂*_β.
- Use the algorithms in Sections 13.5.2 and 13.5.3 (either numerical or
Monte Carlo evaluation) to compute the bootstrap estimate F̂*(t) at the
desired values of t.
4. For each desired value of t, the bootstrap confidence interval for F(t)
is calculated with the following bias-corrected percentile bootstrap
steps:
- Sort the B estimates F̂*(t)_1, …, F̂*(t)_B in increasing order.
- The lower and upper bounds of pointwise approximate
100(1 − α)% confidence intervals for the distribution function
F(t) are, according to [Efron and Tibshirani (1993)],

[F̰(t), F̃(t)] = [ F̂*(t)_(l), F̂*(t)_(u) ],   (13.8)

where

l = B × Φ_nor( 2 Φ⁻¹_nor(q) + Φ⁻¹_nor(α/2) ),
u = B × Φ_nor( 2 Φ⁻¹_nor(q) + Φ⁻¹_nor(1 − α/2) ),

and q is the proportion of the B values of F̂*(t) that are less
than F̂(t). q = 0.5 is equivalent to the percentile bootstrap.
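A small helper implementing the index calculation in (13.8); the names are
mine, and degenerate cases (q equal to 0 or 1, which push Φ⁻¹ to ±∞) would
need special handling in practice:

```python
import numpy as np
from scipy.stats import norm

def bc_percentile_interval(F_boot, F_hat, alpha=0.05):
    """Bias-corrected percentile bootstrap interval for F(t), Eq. (13.8).
    F_boot: the B bootstrap estimates F*(t); F_hat: the point estimate."""
    B = len(F_boot)
    F_sorted = np.sort(F_boot)
    q = np.mean(F_boot < F_hat)        # proportion below the point estimate
    z0 = norm.ppf(q)                   # bias-correction constant
    l = int(B * norm.cdf(2 * z0 + norm.ppf(alpha / 2)))
    u = int(B * norm.cdf(2 * z0 + norm.ppf(1 - alpha / 2)))
    # order statistics (l) and (u), 1-based, clamped into range
    return F_sorted[max(l, 1) - 1], F_sorted[min(u, B) - 1]
```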
45
13 DEGRADATION DATA, MODELS, AND DATA ANALYSIS
13.8 Comparison with Traditional Failure-Time
Analysis
Degradation analysis directly models the relationship between degradation
and time and takes account of the censored observations when estimating
F(t), so it can sometimes provide a more reasonable extrapolation of F(t)
than a traditional failure-time analysis.
13.9 Approximate Degradation Analysis
Approximate degradation analysis provides an (approximately correct)
alternative method of analyzing degradation data.
Steps:
1. First step: do a separate analysis for each unit to predict the time at
which the unit will reach the critical degradation level corresponding to
failure.
- For unit i, use the path model y_ij = D_ij + ε_ij and the sample
path data (t_i1, y_i1), …, (t_ik, y_ik) to find the conditional ML
estimates β̂_i = (β̂_1i, …, β̂_ki).
- Solve D(t; β̂_i) = D_f for t and call the solution t̂_i, i.e., the
estimated failure time for unit i.
2. Second step: use all the pseudo failure times from Step 1 as a
complete sample of failure times to estimate F(t): repeat the
procedure for each sample path to obtain all n pseudo failure times,
then do a single distribution analysis of the data t̂_1, …, t̂_n.
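A minimal sketch of the two steps, assuming a linear path D(t) = β_1 + β_2 t
and a lognormal fit in the second step; the per-unit least-squares fit
(equivalent to conditional ML under normal errors) and the function names
are my own choices:

```python
import numpy as np

def pseudo_failure_times(times, paths, Df):
    """Step 1 for a linear path D(t) = b1 + b2*t: least-squares fit per
    unit, then solve D(t; b1, b2) = Df for the pseudo failure time."""
    t_hat = []
    for t, y in zip(times, paths):
        b2, b1 = np.polyfit(t, y, 1)      # slope, intercept
        t_hat.append((Df - b1) / b2)
    return np.array(t_hat)

def fit_lognormal(t_hat):
    """Step 2: treat the pseudo failure times as a complete sample and
    fit, e.g., a lognormal distribution by ML."""
    logt = np.log(t_hat)
    return logt.mean(), logt.std(ddof=0)  # ML estimates of (mu, sigma)
```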
Potential problems:
- The approach is less appealing when the degradation paths are nonlinear.
- It ignores the prediction error in t̂ and does not account for the
measurement error in the observed sample paths.
- The distribution fitted to the pseudo failure times will not, in general,
correspond to the distribution induced by the degradation model.
- For some applications, the sample paths may not contain enough
information to estimate all of the path parameters, making it necessary
to fit different models to different sample paths.
Overall, extrapolation into the tails of the failure time distribution may be
more valid with the actual crossing distribution implied by the degradation
model than with the empirically predicted failure times.
14 Introduction to the Use of Bayesian
Methods for Reliability Data
14.1 Introduction
Combination of extensive past experience and physical/chemical theory can
provide prior information to form a framework for inference and decision
making. However, there are, of course, dangers involved in making strong
assumptions about knowledge of model parameters.
14.2 Using Bayes Rule to Update Prior Information
The posterior distribution of a set of parameters θ is

f(θ | DATA) = L(DATA | θ) f(θ) / ∫ L(DATA | θ) f(θ) dθ
            = R(θ) f(θ) / ∫ R(θ) f(θ) dθ,   (14.1)

where R(θ) = L(θ)/L(θ̂) is the relative likelihood.
There is usually no closed form for this posterior distribution.
14.3 Prior Information and Distributions
Two main sources of prior information:
1. Expert or other subjective opinion.
2. Past data.
14.3.1 Noninformative (Diffuse) Prior Distributions
Noninformative priors are pdfs that are constant over the range of the
model parameters. They are also called "vague priors" or "diffuse priors".
14.3.2 Using Past Data to Specify a Prior Distribution
Combining past data with a non-informative prior distribution gives a
posterior pdf that is proportional to the likelihood.
14.3.3 Expert Opinion and Eliciting Prior Information
A general approach is to elicit information about particular quantities (or
parameters) that, from past experience (or data), can be specified
approximately independently. For example, for a high-reliability integrated
circuit, a good choice would be a quantile in the lower tail of the
failure-time distribution and the lognormal shape parameter σ.
14.4 Numerical Methods for Combining Prior
Information with a Likelihood
14.4.1 Simulation-based Methods for Computing the Posterior
Distribution of θ
Using a larger number of simulated points provides a better approximation;
the number of points used is limited only by computing equipment and
time constraints.
Algorithm 14.1 Monte Carlo Simulation
1. Generate a random sample θ*_i, i = 1, …, M, from the prior f(θ).
2. Retain the i-th sample θ*_i with probability R(θ*_i), the relative
likelihood. Do this by generating a random variable U_i from a
Uniform(0, 1) distribution and retaining the sample if U_i ≤ R(θ*_i).
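A generic sketch of Algorithm 14.1; `sample_prior` and `rel_lik` are
placeholder callables standing in for the user's prior sampler and relative
likelihood:

```python
import numpy as np

def posterior_sample(sample_prior, rel_lik, M, seed=None):
    """Algorithm 14.1: draw theta_i from the prior and keep each draw with
    probability R(theta_i), the relative likelihood (0 <= R <= 1).
    sample_prior(rng) -> one draw; rel_lik(theta) -> R(theta)."""
    rng = np.random.default_rng(seed)
    kept = []
    for _ in range(M):
        theta = sample_prior(rng)
        if rng.uniform() <= rel_lik(theta):   # retain with probability R
            kept.append(theta)
    return np.array(kept)
```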
14.4.2 Marginal Posterior Distributions
Inferences on individual parameters are obtained by using the marginal
posterior distribution of the parameter of interest,

f(θ_j | DATA) = ∫ f(θ | DATA) dθ_(−j),   (14.2)

where θ_(−j) denotes all of the parameters other than θ_j.
Estimates and confidence intervals for a scalar function g(θ) of the
parameters are obtained by using the marginal posterior pdf f[g(θ)|DATA]
and cdf F[g(θ)|DATA], which are approximated by the empirical pdf and cdf
of the simulated values g(θ*_i), respectively.
14.5 Using The Posterior Distribution for Estimation
14.5.1 Bayesian Point Estimation
Bayesian inference for θ and for functions g(θ) of the parameters is based
entirely on f(θ|DATA) and f[g(θ)|DATA], respectively. Given that g(θ) is a
scalar, a common Bayesian estimate of g(θ) is the mean of the posterior
distribution,

ĝ(θ) = E[g(θ)|DATA] = ∫ g(θ) f(θ|DATA) dθ.   (14.3)

This is the estimate that minimizes squared-error loss. Other possible
choices for estimating g(θ) are the mode of the posterior pdf (similar to
the ML estimate) and the median. For example,

ĝ(θ) ≈ (1/M) Σ_{i=1}^{M} g(θ*_i),

the sample mean of the retained draws.
14.5.2 Bayesian Interval Estimation
A 100(1 − α)% confidence interval (credible interval) for a scalar function
g(θ) is any interval [g̰, g̃] satisfying

∫_{g̰}^{g̃} f[g(θ)|DATA] dg(θ) = 1 − α.   (14.4)

The lower and upper bounds can be chosen in different ways:
1. Combine two 100(1 − α/2)% one-sided intervals: this puts equal
probability in each tail and is preferred when there is more concern for
being incorrect in one direction than the other.
2. Highest posterior density (HPD): choose [g̰, g̃] to consist of all
values of g with f(g|DATA) > c, where c is a constant such that
(14.4) holds. It is similar to likelihood-based approximate confidence
intervals, calibrated with a χ² quantile.

If there is more than one parameter in θ, a 100(1 − α)% confidence
region can be computed:

CR_B = {g(θ) : f[g(θ)|DATA] ≥ c},   (14.5)

where c is chosen such that ∫_{CR_B} f[g(θ)|DATA] dg(θ) = 1 − α.
14.6 Bayesian Prediction
Bayesian methods are useful for predicting future events, like the failure
of a unit from a specified population or process. Such events can be
predicted by using the Bayesian posterior predictive distribution.
14.6.1 Bayesian Posterior Predictive Distribution
If X represents a future random variable from f(x|θ), then the posterior
predictive pdf of X is

f(x|DATA) = ∫ f(x|θ) f(θ|DATA) dθ = E_{θ|DATA}[f(x|θ)].   (14.6)
14.6.2 Approximating Posterior Predictive Distribution
The Bayesian posterior predictive pdf can be approximated by the average
of the conditional pdfs f(x|θ*_i),

f(x|DATA) ≈ (1/M) Σ_{i=1}^{M} f(x|θ*_i),   (14.7)

where the θ*_i are sampled using Algorithm 14.1. The cdf can be
approximated similarly by replacing f with F.
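A one-function sketch of (14.7), reusing draws from the `posterior_sample`
sketch above; `pdf` is a placeholder for the conditional density f(x|θ):

```python
import numpy as np

def predictive_pdf(x, theta_draws, pdf):
    """Eq. (14.7): average the conditional pdf f(x|theta) over the
    posterior draws theta_i (e.g., the output of posterior_sample)."""
    return np.mean([pdf(x, th) for th in theta_draws])
```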
14.6.3 Posterior Predictive Distribution for the k-th Failure from
a Future Sample of Size m
Assume that the time T has a log-location-scale distribution. Let T_(k)
denote the k-th smallest observation (the k-th order statistic) in a future
sample of size m from the distribution of T. The pdf of T_(k) conditional
on θ is

f(t_(k)|θ) = m!/[(k − 1)!(m − k)!] × [Φ(ζ)]^{k−1} × (1/(σ t_(k))) φ(ζ) × [1 − Φ(ζ)]^{m−k},   (14.8)

where ζ = [log(t_(k)) − μ]/σ.
The cdf of T_(k) conditional on θ is

P[T_(k) ≤ t_(k)|θ] = F[t_(k)|θ] = Σ_{j=k}^{m} m!/[j!(m − j)!] × [Φ(ζ)]^j × [1 − Φ(ζ)]^{m−j}.   (14.9)

Combining (14.7) with (14.8) and (14.9) provides the posterior predictive
pdf and cdf of T_(k).
14.7 Practical Issues in the Application of Bayesian
Methods
14.7.1 Comparison Between Bayesian and
Likelihood/Frequentist Statistical Methods
Difference: the manner in which nuisance parameters are handled.

Bayesian:
- Bayesian interval inference methods are based on a marginal
distribution in which nuisance parameters have been integrated out.
- Parameter uncertainty can be interpreted in terms of probabilities
from the marginal posterior distribution.

Likelihood:
- Nuisance parameters can be maximized out, as suggested by
large-sample theory.
- Confidence intervals based on likelihood and profile likelihood
functions can be calibrated and interpreted in terms of repeated
sampling coverage probabilities.

In large samples, where the likelihood and posterior are approximately
symmetric, Bayesian and likelihood confidence interval methods give very
similar answers when prior information is approximately uninformative.

Table 2: Comparison of Bayesian and Likelihood Methods
14.7.2 Caution on the Use of Prior Information
Analysts and decision makers must beware of and avoid the use of "wishful
thinking" as prior information. The potential for generating seriously
misleading conclusions is especially high when experimental data are
limited and the prior distribution dominates the final answers.
When using Bayesian statistics, it is important to do a sensitivity analysis
with respect to the uncertain inputs to one's model, i.e., change the prior
distribution assumptions and check the effect that the changes have on the
final answers of interest.
15 System Reliability Concepts and
Methods
15.1 Introduction
Assessing and improving system reliability generally requires consideration
of system structure and component reliability. Also, some systems are
replaced upon failure, while many are maintained and/or repaired after
failure. For repairable systems, availability may be the appropriate metric.
This leads to consideration of maintainability and repairability. In general,
availability can be increased by increasing reliability or by improving
maintainability and repairability.
Availability: the fraction of time that a system is available for use.
Maintainability: improvement of reliability through inspection and/or
preventive maintenance.
Repairability: characterized by the distribution of time to do a repair.
15.2 System Structures and System Failure
Probability
The system failure probability, F_T(t; θ), is the probability that the system
fails before time t. The failure probability of the system is a function of:
- time in operation,
- the system structure,
- the reliability of the system components,
- interconnections and interfaces.
Complicated system structures can generally be decomposed into
collections of simpler structures, and the methods for evaluation of
system reliability can be adapted to more complicated structures.
15.2.1 Time Dependency of System Reliability
Time dependency of the survival probability is suppressed in this chapter.
Then the cdf of a system with s components is

F_T(θ) = g[F_1(θ_1), …, F_s(θ_s)].
15.2.2 System with Component in Series
For a system with s independent components in series, the system cdf is

F_T(t) = 1 − ∏_{i=1}^{s} [1 − F_i(t)].   (15.1)

The system hazard function is the sum of the component hazard functions,

h_T(t) = Σ_{i=1}^{s} h_i(t),   (15.2)
which can be derived from the hazard function definition in Chapter 2:

1 − F_T(t) = e^{−H_T(t)} = ∏_{i=1}^{s} [1 − F_i(t)],

H_T(t) = −log[1 − F_T(t)] = −Σ_{i=1}^{s} log[1 − F_i(t)] = Σ_{i=1}^{s} H_i(t).
Importance of Part Count in Product Design:
A rule of thumb in reliability engineering is to keep the part count small,
i.e., keep the number of individual components in a system to a minimum.
Effect of Positive Dependency in a Two-Component Series System:
The assumption of independence is conservative in the sense that the actual
F_T(t) is smaller than that predicted by the independent-component model.
If the correlation is negative, the prediction will be anti-conservative, but
negative correlation is uncommon in physical systems.
15.2.3 System with Components in Parallel
For a system with s independent components in parallel, the system cdf is

F_T(t) = ∏_{i=1}^{s} F_i(t).   (15.3)
Effect of Positive Dependency in a Two-Component
Parallel-Redundant System:
The advantage of redundancy can be seriously degraded when the failure
times of the individual components have positive dependence, as the system
reliability will be smaller if there exists a positive relationship.
15.2.4 Systems with Components in Series-Parallel
Series-Parallel System Structure with System-Level Redundancy
An r × k series-parallel structure with system-level redundancy has r
parallel sets, each with k components in series. With independent
components, the cdf is

F_T(t) = ∏_{i=1}^{r} [ 1 − ∏_{j=1}^{k} (1 − F_ij(t)) ].   (15.4)
Series-Parallel System with Component-Level Redundancy
A k × r series-parallel system with independent components has k series
structures of r components in parallel. It has cdf

F_T(t) = 1 − ∏_{i=1}^{k} [ 1 − ∏_{j=1}^{r} F_ij(t) ].   (15.5)
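A small sketch evaluating (15.1), (15.3), (15.4), and (15.5) at a fixed time
t from arrays of component cdf values; the function names and example
numbers are mine:

```python
import numpy as np

def F_series(F):            # Eq. (15.1): s independent components in series
    return 1 - np.prod(1 - np.asarray(F))

def F_parallel(F):          # Eq. (15.3): s independent components in parallel
    return np.prod(np.asarray(F))

def F_system_redundant(F):  # Eq. (15.4): r x k, redundancy at system level
    # F is an (r, k) array of component cdf values at time t
    return np.prod(1 - np.prod(1 - np.asarray(F), axis=1))

def F_component_redundant(F):  # Eq. (15.5): k series stages of r in parallel
    # F is a (k, r) array of component cdf values at time t
    return 1 - np.prod(1 - np.prod(np.asarray(F), axis=1))

# Example: with component failure probability 0.1 everywhere, component-level
# redundancy gives the smaller system failure probability of the two layouts.
F = np.full((2, 3), 0.1)
print(F_system_redundant(F), F_component_redundant(F.T))
```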
15.2.5 Bridge System Structure
Depending on whether the component at the bridge position works or not,
the bridge system can be treated as series-parallel with component-level or
system-level redundancy, so the cdf can be computed as the sum of the two
conditional cases.
15.2.6 k-Out-of-s System Structures
A system works if at least k out of its s components work, but not
otherwise. The cdf is

F_T(t) = P(T ≤ t)
       = P(at least s − k + 1 components fail)
       = Σ_{j=s−k+1}^{s} Σ_{δ∈A_j} { ∏_{i=1}^{s} F_i^{δ_i} (1 − F_i)^{1−δ_i} },

where A_j is the set of vectors δ = (δ_1, …, δ_s) of 0's and 1's with exactly
j entries equal to 1 (δ_i = 1 indicating that component i has failed).
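When the components are identical (F_i = F for all i), the sum over A_j
collapses to a binomial probability; a minimal sketch under that assumption:

```python
from scipy.stats import binom

def F_k_out_of_s(F, k, s):
    """k-out-of-s system with iid components: the system fails when at
    least s - k + 1 components have failed (binomial special case of the
    sum over A_j above)."""
    # P(number of failed components >= s - k + 1)
    return binom.sf(s - k, s, F)

print(F_k_out_of_s(0.1, k=2, s=3))  # 2-out-of-3 system with F_i = 0.1
```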
15.3 Estimating System Reliability From Component
Data
Maximum likelihood estimation can be done for each component, and the ML
estimate of the system reliability can then be obtained accordingly. The
delta method can be used to find the variance-covariance matrix.
Normal-approximation confidence intervals or bootstrap approximate
confidence intervals are appropriate.
16 Analysis of Repairable System and
Other Recurrence Data
16.1 Introduction
16.1.1 Repairable System Reliability Data and Other
Recurrence Data
1. Generally, repair times are measured in terms of a system age or time
since some well-defined specific event in the system’s history.
2. The stochastic model for recurrence data is called a “point-process”
model.
Repairable system data are collected to estimate or predict quantities like:
1. The distribution of the times between repairs, τ_j = T_j − T_{j−1}.
2. The cumulative number of repairs in the interval (0, t] as a function of
system age t.
3. The expected time between failures (MTBF).
4. The expected number of repairs in the interval (0, t] as a function of t.
5. The repair rate as a function of t.
6. The average repair cost as a function of t.
16.1.2 A Nonparametric Model for Recurrence Data
For a single system, recurrence data can be expressed as N(s, t), the
cumulative number of recurrences in the system age interval (s, t]. N(t)
is used to represent N(0, t) for simplicity.
1. The model to describe a population of systems is based on the mean
cumulative function (MCF) at system age t.
2. The population MCF is defined as μ(t) = E[N(t)], where the
expectation is over the variability of each system and the unit-to-unit
variability in the population.
3. ν(t) = dE[N(t)]/dt = dμ(t)/dt is the recurrence rate per system for the
population, given that μ(t) is differentiable.
4. The methods introduced next can also be used to model quantities other
than the number of repairs that accumulate in time, such as the mean
cumulative cost per system.
16.2 Non-Parametric Estimation of The MCF
16.2.1 Non-Parametric Model Assumptions
Whether an observed collection of n ≥ 1 systems is an entire population of
interest or a sample from a larger population of systems, the method
described here can be used to estimate the population MCF.
1. There exists a population of cumulative functions (one for each system
in the population), from which a sample has been observed.
2. Randomness in the sample is due to the random sampling of
cumulative functions from the population.
3. The time at which observation of a system is terminated does not
depend on the system's history.
A biased MCF estimate may be caused by the following: units follow a
staggered scheme of entry into service, and the recurrence rate ν(t) is
increasing in real time due to some external event affecting all systems
simultaneously. Result: the newer systems, which have a more stressful
life, will be censored earlier, and an overly optimistic estimate of the
recurrence rate will be obtained. The third assumption is then not
satisfied. Note, however, that the non-parametric estimator does not
require that the sampled systems be statistically independent.
16.2.2 Point Estimate of the MCF
A simple naive estimator of the population MCF is the sample mean of the
available N_i(t) for the systems still operating at time t, where N_i(t)
denotes the cumulative number of recurrences for system i up to time t.
Limitation: this estimator is appropriate only if all systems are still
operating at time t.
An unbiased estimator proposed by Nelson (1988) allows for different
lengths of observation among the systems.
Algorithm 16.1: Computation of the MCF estimate
1. Order the unique recurrence times t_ij among all n systems, where t_ij
denotes the j-th recurrence time for system i. Let m denote the number
of unique times: t_1 < t_2 < ··· < t_m.
2. Compute d_i(t_k), the total number of recurrences for system i at t_k.
3. Let δ_i(t_k) = 1 if system i is still being observed at time t_k, and 0
otherwise.
4. Compute

μ̂(t_j) = Σ_{k=1}^{j} [ Σ_{i=1}^{n} δ_i(t_k) d_i(t_k) / Σ_{i=1}^{n} δ_i(t_k) ] = Σ_{k=1}^{j} d.(t_k)/δ.(t_k),   (16.1)

where d.(t_k) is the total number of system recurrences at time t_k and
δ.(t_k) is the size of the risk set at t_k.

One may plot μ̂(t) as a piecewise linear function for a better visual
perception of its shape.
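A small sketch of Algorithm 16.1; the data layout (one array of recurrence
times plus an observation end time per system) and the function name are my
own:

```python
import numpy as np

def mcf_estimate(recurrences, end_times):
    """Algorithm 16.1 (Nelson's estimator). recurrences[i] is the array of
    recurrence times for system i; end_times[i] is when its observation
    ended. Returns the unique event times and mu_hat at those times."""
    events = np.sort(np.unique(np.concatenate(recurrences)))   # step 1
    mu_hat, total = [], 0.0
    for tk in events:
        d = sum(np.sum(rec == tk) for rec in recurrences)      # step 2: d.(tk)
        at_risk = sum(tk <= e for e in end_times)              # step 3: delta.(tk)
        total += d / at_risk                                   # step 4, Eq. (16.1)
        mu_hat.append(total)
    return events, np.array(mu_hat)
```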
16.2.3 Standard Errors and Non-parametric Confidence
Intervals for the MCF
In Nelson (1995a), the true variance of μ̂(t_j) for a large population of
cumulative functions is

Var[μ̂(t_j)] = Σ_{k=1}^{j} Var[d̄(t_k)] + 2 Σ_{k=1}^{j−1} Σ_{ν=k+1}^{j} Cov[d̄(t_k), d̄(t_ν)]
            = Σ_{k=1}^{j} Var[d(t_k)]/δ.(t_k) + 2 Σ_{k=1}^{j−1} Σ_{ν=k+1}^{j} Cov[d(t_k), d(t_ν)]/δ.(t_k),   (16.2)

where d̄(t_k) = d.(t_k)/δ.(t_k) is the average number of recurrences per
system at t_k.
Assume that the d_i(t_k) are a random sample from the population of d(t_k),
and use moment estimation to obtain estimators of the variances and
covariances above. More details are in Section 16.2.3 of Meeker and
Escobar (1998).
Pointwise normal-approximation confidence intervals for the population MCF
can be computed using the general approach and the estimated standard
error.
When μ(t) is positive, intervals can be generated based on treating

[log(μ̂(t)) − log(μ(t))] / ŝe_{log μ̂(t)}

as approximately NOR(0, 1).
Finite Population Correction: when the number of cumulative
functions sampled is more than 5% or 10% of the population, finite
population methods should be used for estimating standard errors. In this
case, the moment-based estimates of the variances and covariances should
be multiplied by the factor 1 − δ.(t_k)/N, where N is the total number of
cumulative functions in the population of interest.
16.2.4 Adequacy of Normal-Approximation Confidence Intervals
The adequacy depends on:
1. The number of sample cumulative functions at risk to failure;
2. The shape of the distribution of the cumulative function levels at the
point in time where the interval is to be constructed.
Improvements:
1. Use t_{p;ν} instead of z_p, especially when the number of sample systems
at risk is small (say, less than 30).
2. If the cumulative function at a point in time has a normal distribution
and if all units are still under observation, then using t_{p;ν} and
[n/(n − 1)]^{1/2} ŝe_{μ̂} provides an exact interval for two or more systems.
16.3 Non-Parametric Comparison of Two Samples of
Recurrence Data
Suppose there are two populations/processes with mean cumulative
functions μ_1(t) and μ_2(t), respectively. Let Δμ(t) = μ_1(t) − μ_2(t)
represent the difference at time t.
1. A non-parametric estimator of Δμ(t) is Δμ̂(t) = μ̂_1(t) − μ̂_2(t).
2. If μ̂_1(t) and μ̂_2(t) are independent, an estimate of Var[Δμ̂(t)] is
V̂ar[Δμ̂(t)] = V̂ar[μ̂_1(t)] + V̂ar[μ̂_2(t)].
3. An approximate 100(1 − α)% confidence interval is based on Z_{Δμ̂}
being approximately NOR(0, 1).
16.4 Parametric Models for Recurrence Data
The most commonly used models for recurrence data are:
1. Poisson process (homogeneous and non-homogeneous).
2. Renewal process.
3. Superimposed versions of the above processes.
16.4.1 Poisson Process
An integer-valued point process on [0, ∞) is said to be a Poisson process
if it satisfies:
1. N(0) = 0.
2. The numbers of recurrences in disjoint time intervals are statistically
independent.
3. The process recurrence rate ν(t) is positive and
μ(a, b) = E[N(a, b)] = ∫_a^b ν(u) du < ∞ when 0 ≤ a < b < ∞.
It follows that N(a, b) has a Poisson distribution,

P[N(a, b) = d] = [μ(a, b)]^d exp[−μ(a, b)] / d!,   d = 0, 1, 2, …   (16.3)
16.4.2 Homogeneous Poisson Process (HPP)
An HPP is a Poisson process with a constant recurrence rate, ν(t) = 1/θ.
1. The inter-recurrence times τ_j = T_j − T_{j−1} are iid, each with an
EXP(θ) distribution, which follows from
P(τ_j > t) = P[N(T_{j−1}, T_{j−1} + t) = 0].
2. The time T_k = τ_1 + ··· + τ_k to the k-th recurrence has a GAM(θ, k)
distribution.
16.4.3 Non-homogeneous Poisson Process (NHPP)
An NHPP has a non-constant recurrence rate ν(t). In this case, the
inter-recurrence times are neither independent nor identically distributed.
The expected number of recurrences per unit time over (a, b] is

μ(a, b)/(b − a) = [1/(b − a)] ∫_a^b ν(t) dt.   (16.4)
An NHPP model is often specified in terms of its recurrence rate,
ν(t) = ν(t; θ). Two examples:
1. Power model recurrence rate: ν(t; β, η) = (β/η)(t/η)^{β−1}, β > 0, η > 0,
with μ(t; β, η) = (t/η)^β.
2. Log-linear model recurrence rate: ν(t; γ_0, γ_1) = exp(γ_0 + γ_1 t), with
μ(t; γ_0, γ_1) = exp(γ_0)[exp(γ_1 t) − 1]/γ_1.
16.4.4 Renewal Process
A sequence of system recurrences at system ages T_1, T_2, … is a renewal
process if the inter-recurrence times τ_j are iid.
The MCF of a renewal process is known as the "renewal function". An HPP
is a renewal process.
Before using a renewal process model, one needs to check for departures
from the model:
1. Check for trend and non-independence of the inter-recurrence times.
2. If the assumptions are satisfied, one may use the methods of earlier
chapters to describe the distribution of the inter-recurrence times.
16.4.5 Superimposed Renewal Process
The aggregation of the renewals from a group of n independent renewal
processes operating simultaneously is known as a superimposed renewal
process (SRP).
1. Unless the individual renewal processes are HPPs, an SRP is NOT a
renewal process.
2. When the number of systems n is large and the systems have run long
enough to eliminate transients, an SRP behaves like an HPP (similar in
spirit to the central limit theorem).
3. This large-number-of-systems result can sometimes be used to justify
the use of the exponential distribution to model inter-recurrence times.
16.5 Tools for Checking Point-Process Assumptions
16.5.1 Tests for Recurrence Rate Trend
Plots:
1. For a single system, a plot of the cumulative number of system
recurrences vs. time is the simplest plot. Non-linearity in this plot
indicates that the inter-recurrence times are not identically distributed.
2. A plot of the inter-recurrence times vs. system age, or a "time series
plot", allows the discovery of trends or cycles that would suggest that
the inter-recurrence times are not identically distributed.
Tests:
1. Power model: the "Military Handbook" test for β = 1 (HPP):

X²_MHB = −2 Σ_{j=1}^{r} log(t_j/t_end),   (16.5)

where t_j is the j-th recurrence time and t_end is the end of the
observation period. The statistic is approximately χ²(2r) under an HPP.
See http://www.itl.nist.gov/div898/handbook/apr/section2/
apr234.htm#TheMilitaryHandbook for more details.
2. Log-linear model: the Laplace test for trend in the log-linear NHPP
model, i.e., for γ_1 = 0:

Z_LP = [ Σ_{j=1}^{r} t_j/t_end − r/2 ] / sqrt(r/12),   (16.6)

where Z_LP has an approximate standard normal distribution.
3. Both of the above tests can be misleading when there is no trend but
the underlying process is a renewal process other than an HPP. The
Lewis-Robinson test for trend is

Z_LR = Z_LP × τ̄/S_τ,   (16.7)

where τ̄ is the sample mean and S_τ is the standard deviation of the
inter-recurrence times. The second factor on the right-hand side is the
reciprocal of the sample coefficient of variation.
In large samples, Z_LR approximately follows a standard normal
distribution under a renewal process. Z_LR is preferred as a general
test of trend in point-process data.
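A sketch of the three test statistics for a single time-truncated system;
the function name and the two-sided p-values are my additions, and the
time-truncated forms of the statistics are assumed:

```python
import numpy as np
from scipy.stats import chi2, norm

def trend_tests(t, t_end):
    """MIL-HDBK (16.5), Laplace (16.6), and Lewis-Robinson (16.7) trend
    tests for recurrence times t_1 < ... < t_r observed over (0, t_end]."""
    t = np.asarray(t, dtype=float)
    r = len(t)
    x2 = -2 * np.sum(np.log(t / t_end))                   # Eq. (16.5)
    z_lp = (np.sum(t / t_end) - r / 2) / np.sqrt(r / 12)  # Eq. (16.6)
    tau = np.diff(np.concatenate(([0.0], t)))             # inter-recurrence times
    z_lr = z_lp * tau.mean() / tau.std(ddof=1)            # Eq. (16.7)
    return {"X2_MHB": x2, "p_MHB": chi2.sf(x2, 2 * r),    # chi-square(2r) ref.
            "Z_LP": z_lp, "p_LP": 2 * norm.sf(abs(z_lp)),
            "Z_LR": z_lr, "p_LR": 2 * norm.sf(abs(z_lr))}
```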
16.5.2 Test for Independent Interrecurrence Times
Consider the serial correlation in the sequence of inter-recurrence times.
Plots: a plot of the inter-recurrence times vs. the lagged inter-recurrence
times provides a graphical representation of serial correlation.
Serial correlation coefficient:

ρ_k = Cov(τ_j, τ_{j+k}) / Var(τ_j).   (16.8)

Sample serial correlation:

ρ̂_k = Σ_{j=1}^{r−k} (τ_j − τ̄)(τ_{j+k} − τ̄) / sqrt[ Σ_{j=1}^{r−k} (τ_j − τ̄)² × Σ_{j=1}^{r−k} (τ_{j+k} − τ̄)² ],   (16.9)

where τ̄ = Σ_{j=1}^{r} τ_j / r.
When ρ_k = 0 and r is large, sqrt(r − k) × ρ̂_k is approximately NOR(0, 1),
and this approximate distribution can be used to assess whether ρ_k
is different from zero.
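A direct transcription of (16.9), together with the normal check; the
function name is mine:

```python
import numpy as np

def serial_correlation(tau, k=1):
    """Sample lag-k serial correlation of inter-recurrence times,
    Eq. (16.9); the second return value is sqrt(r - k) * rho_k, to be
    compared with NOR(0, 1) under rho_k = 0."""
    tau = np.asarray(tau, dtype=float)
    r = len(tau)
    tbar = tau.mean()
    a, b = tau[: r - k] - tbar, tau[k:] - tbar
    rho = np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2))
    return rho, np.sqrt(r - k) * rho
```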
16.6 Maximum Likelihood Fitting of Poisson Process
16.6.1 Poisson Process Likelihood
For ONE system observed over the period (0, t_a], the data are the numbers
of recurrences d_1, …, d_m in the non-overlapping intervals
(t_0, t_1], …, (t_{m−1}, t_m], with t_0 = 0 and t_m = t_a. The likelihood
for the NHPP, according to (16.3), is

L(θ) = P[N(t_0, t_1) = d_1, …, N(t_{m−1}, t_m) = d_m]
     = ∏_{j=1}^{m} P[N(t_{j−1}, t_j) = d_j]
     = ∏_{j=1}^{m} { [μ(t_{j−1}, t_j; θ)]^{d_j} / d_j! } exp[−μ(t_{j−1}, t_j; θ)]
     = ∏_{j=1}^{m} { [μ(t_{j−1}, t_j; θ)]^{d_j} / d_j! } × exp[−μ(t_0, t_a; θ)].   (16.10)

When the exact recurrence times t_1 ≤ ··· ≤ t_r are reported
(r = Σ_{j=1}^{m} d_j), the likelihood in terms of the density approximation is

L(θ) = ∏_{j=1}^{r} ν(t_j; θ) × exp[−μ(0, t_a; θ)].   (16.11)
16.6.2 Superimposed Poisson Process Likelihood
Assumption: all systems have the same ν(t) (a strong assumption), and the
systems are independent. Then, with r_i recurrences observed on system i
over (0, t_{a_i}],

L(θ) = ∏_{i=1}^{n} { ∏_{j=1}^{r_i} ν(t_ij; θ) × exp[−μ(0, t_{a_i}; θ)] }.   (16.12)

Because the assumption is strong, it is often inappropriate in practice;
generalizations, such as the use of explanatory variables to account for
system-to-system differences, are possible.
16.6.3 ML Estimation for the Power NHPP and Loglinear
NHPP
Plug ν(t; θ) into the likelihood function and solve for the ML estimates
and the relative likelihood.
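For the power model with exact recurrence times and time truncation at t_a,
maximizing (16.11) gives closed-form estimates; a sketch (the closed forms
are standard for this model, but the function name is mine):

```python
import numpy as np

def power_nhpp_ml(t, t_a):
    """Closed-form ML estimates for the power NHPP recurrence rate,
    time-truncated at t_a (recurrence times t_1 < ... < t_r)."""
    t = np.asarray(t, dtype=float)
    r = len(t)
    beta = r / np.sum(np.log(t_a / t))
    eta = t_a / r ** (1 / beta)        # solves (t_a/eta)**beta = r
    return beta, eta
```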
16.6.4 Confidence Intervals for Parameters and Functions of
Parameters
The general ideas in Chapters 7 and 8 can be used.
16.6.5 Prediction of Future Recurrences with a Poisson Process
The expected number of recurrences in an interval (a, b] is ∫_a^b ν(u; θ) du,
and predictions can be made by replacing θ with the maximum likelihood
estimate θ̂.
16.7 Generating Pseudo-Random Realizations from
An NHPP Process
Simulated data can be used to check the adequacy of large-sample
approximations and to implement bootstrap methods.
16.7.1 General Approach
For a monotone increasing μ(t), the random variables μ(T_{i−1}, T_i),
i = 1, 2, …, are iid, each with an EXP(1) distribution. The idea is to
generate random inter-recurrence times and solve for the random T values
sequentially:
1. Generate U_i, i = 1, …, r, from a Uniform(0, 1) distribution.
2. Solve sequentially

μ(T_1) = −log(U_1)
μ(T_2) − μ(T_1) = −log(U_2)
⋮
μ(T_r) − μ(T_{r−1}) = −log(U_r).   (16.13)

If a realization in an interval (0, t_a] is needed, then r is random and the
sequential process is stopped when T_i > t_a.
General solution:

T_j = μ⁻¹( −Σ_{i=1}^{j} log(U_i) ).   (16.14)

Or, alternatively,

T_j = μ⁻¹( μ(T_{j−1}) − log(U_j) ),   (16.15)

where T_0 = 0.
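A sketch of this recursion for the power model of Section 16.4.3, where
μ(t) = (t/η)^β inverts to t = η μ^{1/β}; the function name is mine:

```python
import numpy as np

def simulate_power_nhpp(beta, eta, t_a, seed=None):
    """Generate one NHPP realization on (0, t_a] for the power model,
    mu(t) = (t/eta)**beta, using the sequential inversion (16.15)."""
    rng = np.random.default_rng(seed)
    times, mu_prev = [], 0.0
    while True:
        mu_prev -= np.log(rng.uniform())   # mu(T_j) = mu(T_{j-1}) - log(U_j)
        t = eta * mu_prev ** (1 / beta)    # invert mu to get T_j
        if t > t_a:
            return np.array(times)         # stop once T_j exceeds t_a
        times.append(t)
```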
17 Failure-Time Regression Analysis