Applied Regression Analysis (Wiley Series in Probability and Statistics)

Publication series: Wiley Series in Probability and Statistics

Authors: Norman R. Draper, Harry Smith

Publisher: John Wiley & Sons, Inc.

Publication year: 2014

E-ISBN: 9781118625620

P-ISBN (Hardback): 9780471170822

Subject: O212.1 General statistics

Language: ENG



Description

An outstanding introduction to the fundamentals of regression analysis, updated and expanded.

The methods of regression analysis are the most widely used statistical tools for discovering the relationships among variables. This classic text, with its emphasis on clear, thorough presentation of concepts and applications, offers a complete, easily accessible introduction to the fundamentals of regression analysis. Assuming only a basic knowledge of elementary statistics, Applied Regression Analysis, Third Edition focuses on the fitting and checking of both linear and nonlinear regression models, using small and large data sets, with pocket calculators or computers.

This Third Edition features separate chapters on multicollinearity, generalized linear models, mixture ingredients, geometry of regression, robust regression, and resampling procedures. Extensive support materials include sets of carefully designed exercises with full or partial solutions and a series of true/false questions with answers. All data sets used in both the text and the exercises can be found on the companion disk at the back of the book.

For analysts, researchers, and students in university, industrial, and government courses on regression, this text is an excellent introduction to the subject and an efficient means of learning how to use a valuable analytical tool. It will also prove an invaluable reference resource for applied scientists and statisticians.

Chapter

Special Matrices and Vectors

Orthogonality

Inverse Matrix

Obtaining an Inverse

Determinants

Common Factors

Chapter 1: Fitting a Straight Line by Least Squares

1.0. Introduction: The Need for Statistical Analysis

1.1. Straight Line Relationship Between Two Variables

1.2. Linear Regression: Fitting a Straight Line by Least Squares

Meaning of Linear Model

Least Squares Estimation

Pocket-calculator Form

Calculations for the Steam Data

Centering the Data

1.3. The Analysis of Variance

Sums of Squares

Degrees of Freedom (df)

Analysis of Variance Table

Steam Data Calculations

Skeleton Analysis of Variance Table

R² Statistic

1.4. Confidence Intervals and Tests for β0 and β1

Standard Deviation of the Slope b1; Confidence Interval for β1

Confidence Interval for β1

Test for H0: β1 = β10 Versus H1: β1 ≠ β10

Reject or Do Not Reject

Confidence Interval Represents a Set of Tests

Standard Deviation of the Intercept; Confidence Interval for β0

1.5. F-test for Significance of Regression

P-values for F-statistics

F = t²

P-values for t-statistics

1.6. The Correlation Between X and Y

Correlation and Regression

rXY and R Connections

Testing a Single Correlation

1.7. Summary of the Straight Line Fit Computations

Pocket-calculator Computations

1.8. Historical Remarks

Appendix 1A. Steam Plant Data

Exercises

Chapter 2: Checking the Straight Line Fit

2.1. Lack of Fit and Pure Error

General Discussion of Variance and Bias

How Big Is σ²?

Genuine Repeats Are Needed

Calculation of Pure Error and Lack of Fit Mean Squares

Special Formula When nj = 2

Split of the Residual SS

Effect of Repeat Runs on R²

Looking at the Data and Fitted Model

Pure Error in the Many Predictors Case

Adding (or Dropping) X's Can Affect Maximum R²

Approximate Repeats

Generic Pure Error Situations Illustrated Via Straight Line Fits

2.2. Testing Homogeneity of Pure Error

Bartlett's Test

Bartlett's Test Modified for Kurtosis

Levene's Test Using Means

Levene's Test Using Medians

Some Cautionary Remarks

A Second Example

2.3. Examining Residuals: The Basic Plots

How Should the Residuals Behave?

2.4. Non-normality Checks on Residuals

Normal Plot of Residuals

2.5. Checks for Time Effects, Nonconstant Variance, Need for Transformation, and Curvature

Three Questions and Answers

Comment

2.6. Other Residuals Plots

Dependencies Between Residuals

2.7. Durbin–Watson Test

2.8. Reference Books for Analysis of Residuals

Appendix 2A. Normal Plots

Normal Scores

Outliers

Some General Characteristics of Normal Plots

Making Your Own Probability Paper

Appendix 2B. Minitab Instructions

Exercises

Chapter 3: Fitting Straight Lines: Special Topics

3.0. Summary and Preliminaries

Covariance of Two Linear Functions

3.1. Standard Error of Ŷ

Intervals for Individual Observations and Means of q Observations

3.2. Inverse Regression (Straight Line Case)

3.3. Some Practical Design of Experiment Implications of Regression

Experimental Strategy Decisions

An Example

Comments on Table 3.1

3.4. Straight Line Regression When Both Variables Are Subject to Error

Practical Advice

Geometric Mean Functional Relationship

References

Exercises for Chapters 1–3

Chapter 4: Regression in Matrix Terms: Straight Line Case

Matrices

4.1. Fitting a Straight Line in Matrix Terms

Manipulating Matrices

Orthogonality

The Model in Matrix Form

Setup for a Quadratic Model

Transpose

Inverse of a Matrix

Inverses of Small Matrices

Matrix Symmetry for Square Matrices

Diagonal Matrices

Inverting Partitioned Matrices with Blocks of Zeros

Less Obvious Partitioning

Back to the Straight Line Case

Solving the Normal Equations

A Small Sermon on Rounding Errors

Section Summary

4.2. Singularity: What Happens in Regression to Make X'X Singular? An Example

Singularity in the General Linear Regression Context

4.3. The Analysis of Variance in Matrix Terms

4.4. The Variances and Covariance of b0 and b1 from the Matrix Calculation

Correlation Between b0 and b1

4.5. Variance of Ŷ Using the Matrix Development

4.6. Summary of Matrix Approach to Fitting a Straight Line (Nonsingular Case)

4.7. The General Regression Situation

Exercises for Chapter 4

Chapter 5: The General Regression Situation

5.1. General Linear Regression

A Justification for Using Least Squares

5.2. Least Squares Properties

The R² Statistic

R² Can Be Deceptive

Adjusted R² Statistic

5.3. Least Squares Properties When ε ~ N(0, Iσ²)

Just Significant Regressions May Not Predict Well

The Distribution of R²

Properties, Continued

Bonferroni Limits

5.4. Confidence Intervals Versus Regions

Moral

5.5. More on Confidence Intervals Versus Regions

When F-test and t-tests Conflict

References

Appendix 5A. Selected Useful Matrix Results

Exercises

Chapter 6: Extra Sums of Squares and Tests for Several Parameters Being Zero

6.1. The "extra Sum of Squares" Principle

Polynomial Models

Other Points

Two Alternative Forms of the Extra SS

Sequential Sums of Squares

Special Problems with Polynomial Models

Partial Sums of Squares

When t = √F

6.2. Two Predictor Variables: Example

How Useful Is the Fitted Equation?

What Has Been Accomplished by the Addition of a Second Predictor Variable (namely, X6)?

The Standard Error S

Extra SS F-test Criterion

Standard Error of bi

Correlations Between Parameter Estimates

Confidence Limits for the True Mean Value of Y, Given a Specific Set of X's

Confidence Limits for the Mean of q Observations Given a Specific Set of X's

Examining the Residuals

6.3. Sum of Squares of a Set of Linear Functions of Y's

Appendix 6A. Orthogonal Columns in the X Matrix

Appendix 6B. Two Predictors: Sequential Sums of Squares

References

Exercises for Chapters 5 and 6

Chapter 7: Serial Correlation in the Residuals and the Durbin–Watson Test

7.1. Serial Correlation in Residuals

7.2. The Durbin–Watson Test for a Certain Type of Serial Correlation

Primary Test, Tables of dL and dU

A Simplified Test

Width of the Primary Test Inconclusive Region

Mean Square Successive Difference

7.3. Examining Runs in the Time Sequence Plot of Residuals: Runs Test

Runs

Tables for Modest n1 and n2

Larger n1 and n2 Values

Comments

References

Exercises for Chapter 7

Chapter 8: More on Checking Fitted Models

8.1. The Hat Matrix H and the Various Types of Residuals

Variance-covariance Matrix of e

Other Facts About H

Internally Studentized Residuals

Extra Sum of Squares Attributable to ej

Externally Studentized Residuals

Other Comments

8.2. Added Variable Plot and Partial Residuals

Added Variable Plot

Partial Residuals

8.3. Detection of Influential Observations: Cook's Statistics

Higher-order Cook's Statistics

Another Worked Example

Plots

8.4. Other Statistics Measuring Influence

The DFFITS Statistics

Atkinson's Modified Cook's Statistics

8.5. Reference Books for Analysis of Residuals

Exercises for Chapter 8

Chapter 9: Multiple Regression: Special Topics

9.1. Testing a General Linear Hypothesis

Testing a General Linear Hypothesis Cβ = 0

9.2. Generalized Least Squares and Weighted Least Squares

Generalized Least Squares Residuals

General Comments

Application to Serially Correlated Data

9.3. An Example of Weighted Least Squares

9.4. A Numerical Example of Weighted Least Squares

9.5. Restricted Least Squares

9.6. Inverse Regression (Multiple Predictor Case)

9.7. Planar Regression When All the Variables Are Subject to Error

Appendix 9A. Lagrange's Undetermined Multipliers

Notation

Basic Method

Is the Solution a Maximum or Minimum?

Exercises for Chapter 9

Chapter 10: Bias in Regression Estimates, and Expected Values of Mean Squares and Sums of Squares

10.1. Bias in Regression Estimates

10.2. The Effect of Bias on the Least Squares Analysis of Variance

10.3. Finding the Expected Values of Mean Squares

10.4. Expected Value of Extra Sum of Squares

Exercises for Chapter 10

Chapter 11: On Worthwhile Regressions, Big F’s, and R²

11.1. Is My Regression a Useful One?

An Alternative and Simpler Check

Proof of (11.1.3)

Comment

11.2. A Conversation About R²

What Should One Do for Linear Regression?

References

Appendix 11A. How Significant Should My Regression Be?

The γm Criterion

Exercises for Chapter 11

Chapter 12: Models Containing Functions of the Predictors, Including Polynomial Models

12.1. More Complicated Model Functions

Polynomial Models of Various Orders in the Xj

First-order Models

Second-order Models

Third-order Models

Transformations

Models Involving Transformations Other Than Integer Powers

Plots Can Be Useful

Using Ratios as Responses and/or Predictors

12.2. Worked Examples of Second-order Surface Fitting for k = 3 and k = 2 Predictor Variables

Do We Need X2?

Treatment of Pure Error When Factors Are Dropped

Treatment of Pure Error When a Design Is Blocked

On Dropping Terms

12.3. Retaining Terms in Polynomial Models

Example 1. Quadratic Equation in X

Criterion 1. The Origin Shift Criterion

Example 2. Second-order Polynomial in Two X's

Rule 1

Example 2. Continued

Example 3. Third-order Polynomial in Three Factors

Example 4

Criterion 2. The Axes Rotation Criterion

Example 5. Second-order Polynomial in Two X's

Rule 2

Application of Rules 1 and 2 Together

"do We Need This X?"

Summary Advice

Using a Selection Procedure for a Polynomial Fit

References

Exercises for Chapter 12

Chapter 13: Transformation of the Response Variable

13.1. Introduction and Preliminary Remarks

Simplifying Models Via Transformation

Thinking About the Error Structure

Predictions in Y-space

Preliminary Remarks on the Power Family of Transformations

Points to Keep in Mind

13.2. Power Family of Transformations on the Response: Box–Cox Method

Maximum Likelihood Method of Estimating λ

Some Conversations on How to Proceed

Approximate Confidence Interval for λ

The Confidence Statement Has Several Forms

Importance of Checking Residuals

13.3. A Second Method for Estimating λ

Advantages of the Likelihood Method

13.4. Response Transformations: Other Interesting and Sometimes Useful Plots

13.5. Other Types of Response Transformations

A Two-parameter Family of Response Transformations

A Modulus Family of Response Transformations

Transforming Both Sides of the Model

A Power Family for Proportions

13.6. Response Transformations Chosen to Stabilize Variance

Estimation of K in Table 13.11

Transformations for Responses That Are Proportions

Transformations for Responses That Are Poisson Counts

References

Exercises for Chapter 13

Chapter 14: “Dummy” Variables

What Are “Dummy” Variables?

An Infinite Number of Choices

14.1. Dummy Variables to Separate Blocks of Data with Different Intercepts, Same Model

Other Possibilities

How Many Dummies?

Three Categories, Three Dummies

R Categories, R Dummies

An Alternative Analysis of Variance Sequence

Will My Selected Dummy Setup Work?

Other Verification Methods

14.2. Interaction Terms Involving Dummy Variables

Two Sets of Data, Straight Line Models

Hierarchical Models

Three Sets of Data, Straight Line Models

Two Sets of Data: Quadratic Model

General Case: R Sets, Linear Model

14.3. Dummy Variables for Segmented Models

One Segment

Two Segments

Case 1: When It Is Known Which Points Lie on Which Segments

Straight Line and Quadratic Curve

Case 2: When It Is Not Known Which Points Lie on Which Segments

Remark

Exercises for Chapter 14

Chapter 15: Selecting the “Best” Regression Equation

15.0. Introduction

Some Cautionary Remarks on the Use of Unplanned Data

Example Data

Comments

15.1. All Possible Regressions and “Best Subset” Regression

Use of the R² Statistic

Use of the Residual Mean Square, s²

Use of the Mallows Cp Statistic

Example of Use of the Cp Statistic

Remark

“Best Subset” Regression

15.2. Stepwise Regression

Stepwise Regression on the Hald Data

Remarks

Forward Selection Result

Minitab Version of Stepwise Regression

15.3. Backward Elimination

Remarks

15.4. Significance Levels for Selection Procedures

Selecting Significance Levels in Stepwise Regression

A Drawback to Understand but Not Be Overly Concerned About

15.5. Variations and Summary

Variations on the Previous Methods

Summary

15.6. Selection Procedures Applied to the Steam Data

Remarks

Appendix 15A. Hald Data, Correlation Matrix, and All 15 Possible Regressions

Exercises for Chapter 15

Chapter 16: Ill-conditioning in Regression Data

16.1. Introduction

A Simple Example

Demonstrating Dependence in X Via Regression

16.2. Centering Regression Data

Centering

Singularity and Centering

16.3. Centering and Scaling Regression Data

Centering and Scaling and Singularity

16.4. Measuring Multicollinearity

Recommendations on Suggestions 1–6

What Are the Relationships?

16.5. Belsley's Suggestion for Detecting Multicollinearity

Comments

How Large Is a “Large” Condition Index?

Appendix 16A. Transforming X Matrices to Obtain Orthogonal Columns

Example

Exercises for Chapter 16

Chapter 17: Ridge Regression

17.1. Introduction

17.2. Basic Form of Ridge Regression

17.3. Ridge Regression of the Hald Data

Automatic Choice of θ*

Possible Use of Ridge Regression as a Selection Procedure; Other θ*

17.4. In What Circumstances Is Ridge Regression Absolutely the Correct Way to Proceed?

Comments

Canonical Form of Ridge Regression

17.5. The Phoney Data Viewpoint

17.6. Concluding Remarks

Ridge Regression Simulations—a Caution

Summary

Opinion

17.7. References

Appendix 17A. Ridge Estimates in Terms of Least Squares Estimates

Appendix 17B. Mean Square Error Argument

Appendix 17C. Canonical Form of Ridge Regression

Residual Sum of Squares

Mean Square Error

Some Alternative Formulas

Exercises for Chapter 17

Chapter 18: Generalized Linear Models (GLIM)

18.1. Introduction

Acronym

18.2. The Exponential Family of Distributions

Some Definitions

Some Members of the Exponential Family

Expected Value and Variance of a(u)

Joint Probability Density of a Sample

18.3. Fitting Generalized Linear Models (GLIM)

Example: Binomial Distributions, Indices ni, Parameters pi

Estimation Via Maximum Likelihood

Deviance

18.4. Performing the Calculations: An Example

18.5. Further Reading

Exercise for Chapter 18

Chapter 19: Mixture Ingredients as Predictor Variables

19.1. Mixture Experiments: Experimental Spaces

Two Ingredients

Three Ingredients

Four Ingredients

Five or More Ingredients

19.2. Models for Mixture Experiments

Three Ingredients, First-order Model

Canonical Form

q Ingredients, First-order Model

Example: The Hald Data

Test for Overall Regression

Three Ingredients, Second-order Model

q Ingredients, Second-order Model

Example: The Hald Data

Third-order Canonical Model Form

Scheffé Designs

19.3. Mixture Experiments in Restricted Regions

19.4. Example 1

19.5. Example 2

References

Appendix 19A. Transforming q Mixture Variables to q - 1 Working Variables

General q

Exercises for Chapter 19

Chapter 20: The Geometry of Least Squares

20.1. The Basic Geometry

20.2. Pythagoras and Analysis of Variance

Further Split-up of a Regression Sum of Squares

An Orthogonal Breakup of the Normal Equations

Orthogonalizing the Vectors of X in General

20.3. Analysis of Variance and F-test for Overall Regression

20.4. The Singular X'X Case: An Example

Example

20.5. Orthogonalizing in the General Regression Case

20.6. Range Space and Null Space of a Matrix M

Projection Matrices

20.7. The Algebra and Geometry of Pure Error

The Geometry of Pure Error

Appendix 20A. Generalized Inverses M⁻

Moore–Penrose Inverse

Getting a Generalized Inverse

A Method for Getting M⁻

Example

What Should One Do?

Exercises for Chapter 20

Chapter 21: More Geometry of Least Squares

21.1. The Geometry of a Null Hypothesis: A Simple Example

21.2. General Case H0: Aβ = C: The Projection Algebra

Properties

21.3. Geometric Illustrations

21.4. The F-test for H0, Geometrically

21.5. The Geometry of R²

21.6. Change in R² for Models Nested Via Aβ = 0, Not Involving β0

21.7. Multiple Regression with Two Predictor Variables as a Sequence of Straight Line Regressions

Geometrical Interpretation

Exercises for Chapter 21

Chapter 22: Orthogonal Polynomials and Summary Data

22.1. Introduction

22.2. Orthogonal Polynomials

Orthogonal Polynomials for n = 3, ..., 12

22.3. Regression Analysis of Summary Data

Exercises for Chapter 22

Chapter 23: Multiple Regression Applied to Analysis of Variance Problems

23.1. Introduction

Fixed Effects, Variable Effects

23.2. The One-way Classification: Standard Analysis and an Example

23.3. Regression Treatment of the One-way Classification Example

A Caution

Relationship to the Underlying Geometry

23.4. Regression Treatment of the One-way Classification Using the Original Model

23.5. Regression Treatment of the One-way Classification: Independent Normal Equations

23.6. The Two-way Classification with Equal Numbers of Observations in the Cells: An Example

23.7. Regression Treatment of the Two-way Classification Example

23.8. The Two-way Classification with Equal Numbers of Observations in the Cells

23.9. Regression Treatment of the Two-way Classification with Equal Numbers of Observations in the Cells

An Alternative Method

23.10. Example: The Two-way Classification

23.11. Recapitulation and Comments

Exercises for Chapter 23

Chapter 24: An Introduction to Nonlinear Estimation

24.1. Least Squares for Nonlinear Models

Introduction

Nonlinear Models

Least Squares in the Nonlinear Case

24.2. Estimating the Parameters of a Nonlinear System

A Geometrical Interpretation of Linearization

Steepest Descent

Marquardt's Compromise

Confidence Contours

Grids and Plots

The Importance of Good Starting Values

Getting Initial Estimates θ0

24.3. An Example

A Solution Through the Normal Equations

A Solution Through the Linearization Technique

Further Analysis

Confidence Regions

Some Typical Nonlinear Program Output Features

Curvature Measures

Overparameterization

24.4. A Note on Reparameterization of the Model

24.5. The Geometry of Linear Least Squares

The Sample Space

The Sample Space When n = 3, p = 2

The Sample Space Geometry When the Model Is Wrong

Geometrical Interpretation of Pure Error

The Parameter Space

The Parameter Space When p = 2

24.6. The Geometry of Nonlinear Least Squares

The Sample Space

The Parameter Space

Confidence Contours in the Nonlinear Case

Measuring Nonlinearity

24.7. Nonlinear Growth Models

Types of Models

An Example of a Mechanistic Growth Model

Querying the Least Squares Assumptions

The Logistic Model

Another Form of the Logistic Model

How Do We Get the Initial Parameter Estimates?

The Gompertz Model

Von Bertalanffy's Model

24.8. Nonlinear Models: Other Work

Design of Experiments in the Nonlinear Case

A Useful Model-building Technique

Multiple Responses

24.9. References

Exercises for Chapter 24

Chapter 25: Robust Regression

Why Use Robust Regression?

25.1. Least Absolute Deviations Regression (L1 Regression)

25.2. M-estimators

The M-estimation Procedure

25.3. Steel Employment Example

Adjusting the First Observation

Other Analyses Possible

Standard Errors of Estimated Coefficients

25.4. Trees Example

25.5. Least Median of Squares (LMS) Regression

25.6. Robust Regression with Ranked Residuals (RREG)

Other Weights

25.7. Other Methods

25.8. Comments and Opinions

25.9. References

Books

Articles

Exercises for Chapter 25

Chapter 26: Resampling Procedures (Bootstrapping)

26.1. Resampling Procedures for Regression Models

26.2. Example: Straight Line Fit

Using the Original Data

26.3. Example: Planar Fit, Three Predictors

26.4. Reference Books

Appendix 26A. Sample Minitab Programs to Bootstrap Residuals for a Specific Example

Appendix 26B. Sample Minitab Programs to Bootstrap Pairs for a Specific Example

Additional Comments

Exercises for Chapter 26

Bibliography

True/False Questions

Answers to Exercises

Tables

Normal Distribution

Percentage Points of the t-distribution

Percentage Points of the χ²-distribution

Percentage Points of the F-distribution

Index of Authors Associated with Exercises

Index
