Chapter 1: Fitting a Straight Line by Least Squares
1.0. Introduction: the Need for Statistical Analysis
1.1. Straight Line Relationship Between Two Variables
1.2. Linear Regression: Fitting a Straight Line by Least Squares
Calculations for the Steam Data
1.3. The Analysis of Variance
Analysis of Variance Table
Skeleton Analysis of Variance Table
1.4. Confidence Intervals and Tests for β0 and β1
Standard Deviation of the Slope b1; Confidence Interval for β1
Confidence Interval for β1
Test for H0: β1 = β10 Versus H1: β1 ≠ β10
Confidence Interval Represents a Set of Tests
Standard Deviation of the Intercept; Confidence Interval for β0
1.5. F-test for Significance of Regression
P-values for F-statistics
P-values for t-statistics
1.6. The Correlation Between X and Y
Correlation and Regression
Testing a Single Correlation
1.7. Summary of the Straight Line Fit Computations
Pocket-calculator Computations
Appendix 1A. Steam Plant Data
Chapter 2: Checking the Straight Line Fit
2.1. Lack of Fit and Pure Error
General Discussion of Variance and Bias
Genuine Repeats Are Needed
Calculation of Pure Error and Lack of Fit Mean Squares
Special Formula When nj = 2
Effect of Repeat Runs on R2
Looking at the Data and Fitted Model
Pure Error in the Many Predictors Case
Adding (or Dropping) X's Can Affect Maximum R2
Generic Pure Error Situations Illustrated Via Straight Line Fits
2.2. Testing Homogeneity of Pure Error
Bartlett's Test Modified for Kurtosis
Levene's Test Using Means
Levene's Test Using Medians
2.3. Examining Residuals: the Basic Plots
How Should the Residuals Behave?
2.4. Non-normality Checks on Residuals
2.5. Checks for Time Effects, Nonconstant Variance, Need for Transformation, and Curvature
Three Questions and Answers
2.6. Other Residuals Plots
Dependencies Between Residuals
2.8. Reference Books for Analysis of Residuals
Appendix 2A. Normal Plots
Some General Characteristics of Normal Plots
Making Your Own Probability Paper
Appendix 2B. Minitab Instructions
Chapter 3: Fitting Straight Lines: Special Topics
3.0. Summary and Preliminaries
Covariance of Two Linear Functions
Intervals for Individual Observations and Means of q Observations
3.2. Inverse Regression (Straight Line Case)
3.3. Some Practical Design of Experiment Implications of Regression
Experimental Strategy Decisions
3.4. Straight Line Regression When Both Variables Are Subject to Error
Geometric Mean Functional Relationship
Exercises for Chapters 1–3
Chapter 4: Regression in Matrix Terms: Straight Line Case
4.1. Fitting a Straight Line in Matrix Terms
Setup for a Quadratic Model
Inverses of Small Matrices
Matrix Symmetry for Square Matrices
Inverting Partitioned Matrices with Blocks of Zeros
Less Obvious Partitioning
Back to the Straight Line Case
Solving the Normal Equations
A Small Sermon on Rounding Errors
4.2. Singularity: What Happens in Regression to Make X'X Singular? An Example
Singularity in the General Linear Regression Context
4.3. The Analysis of Variance in Matrix Terms
4.4. The Variances and Covariance of b0 and b1 from the Matrix Calculation
Correlation Between b0 and b1
4.5. Variance of Y Using the Matrix Development
4.6. Summary of Matrix Approach to Fitting a Straight Line (Nonsingular Case)
4.7. The General Regression Situation
Chapter 5: The General Regression Situation
5.1. General Linear Regression
A Justification for Using Least Squares
5.2. Least Squares Properties
5.3. Least Squares Properties When ε ~ N(0, Iσ²)
Just Significant Regressions May Not Predict Well
5.4. Confidence Intervals Versus Regions
5.5. More on Confidence Intervals Versus Regions
When F-test and t-tests Conflict
Appendix 5A. Selected Useful Matrix Results
Chapter 6: Extra Sums of Squares and Tests for Several Parameters Being Zero
6.1. The "Extra Sum of Squares" Principle
Two Alternative Forms of the Extra SS
Sequential Sums of Squares
Special Problems with Polynomial Models
6.2. Two Predictor Variables: Example
How Useful Is the Fitted Equation?
What Has Been Accomplished by the Addition of a Second Predictor Variable (namely, X6)?
Extra SS F-test Criterion
Correlations Between Parameter Estimates
Confidence Limits for the True Mean Value of Y, Given a Specific Set of X's
Confidence Limits for the Mean of q Observations Given a Specific Set of X's
6.3. Sum of Squares of a Set of Linear Functions of Y's
Appendix 6A. Orthogonal Columns in the X Matrix
Appendix 6B. Two Predictors: Sequential Sums of Squares
Exercises for Chapters 5 and 6
Chapter 7: Serial Correlation in the Residuals and the Durbin–Watson Test
7.1. Serial Correlation in Residuals
7.2. The Durbin–Watson Test for a Certain Type of Serial Correlation
Primary Test, Tables of dL and dU
Width of the Primary Test Inconclusive Region
Mean Square Successive Difference
7.3. Examining Runs in the Time Sequence Plot of Residuals: Runs Test
Tables for Modest n1 and n2
Chapter 8: More on Checking Fitted Models
8.1. The Hat Matrix H and the Various Types of Residuals
Variance-covariance Matrix of e
Internally Studentized Residuals
Extra Sum of Squares Attributable to ej
Externally Studentized Residuals
8.2. Added Variable Plot and Partial Residuals
8.3. Detection of Influential Observations: Cook's Statistics
Higher-order Cook's Statistics
8.4. Other Statistics Measuring Influence
Atkinson's Modified Cook's Statistics
8.5. Reference Books for Analysis of Residuals
Chapter 9: Multiple Regression: Special Topics
9.1. Testing a General Linear Hypothesis
Testing a General Linear Hypothesis Cβ = 0
9.2. Generalized Least Squares and Weighted Least Squares
Generalized Least Squares Residuals
Application to Serially Correlated Data
9.3. An Example of Weighted Least Squares
9.4. A Numerical Example of Weighted Least Squares
9.5. Restricted Least Squares
9.6. Inverse Regression (Multiple Predictor Case)
9.7. Planar Regression When All the Variables Are Subject to Error
Appendix 9A. Lagrange's Undetermined Multipliers
Is the Solution a Maximum or Minimum?
Chapter 10: Bias in Regression Estimates, and Expected Values of Mean Squares and Sums of Squares
10.1. Bias in Regression Estimates
10.2. The Effect of Bias on the Least Squares Analysis of Variance
10.3. Finding the Expected Values of Mean Squares
10.4. Expected Value of Extra Sum of Squares
Chapter 11: On Worthwhile Regressions, Big F's, and R2
11.1. Is My Regression a Useful One?
An Alternative and Simpler Check
11.2. A Conversation About R2
What Should One Do for Linear Regression?
Appendix 11A. How Significant Should My Regression Be?
Chapter 12: Models Containing Functions of the Predictors, Including Polynomial Models
12.1. More Complicated Model Functions
Polynomial Models of Various Orders in the Xj
Models Involving Transformations Other Than Integer Powers
Using Ratios as Responses And/or Predictors
12.2. Worked Examples of Second-order Surface Fitting for k = 3 and k = 2 Predictor Variables
Treatment of Pure Error When Factors Are Dropped
Treatment of Pure Error When a Design Is Blocked
12.3. Retaining Terms in Polynomial Models
Example 1. Quadratic Equation in X
Criterion 1. The Origin Shift Criterion
Example 2. Second-order Polynomial in Two X's
Example 3. Third-order Polynomial in Three Factors
Criterion 2. The Axes Rotation Criterion
Example 5. Second-order Polynomial in Two X's
Application of Rules 1 and 2 Together
Using a Selection Procedure for a Polynomial Fit
Chapter 13: Transformation of the Response Variable
13.1. Introduction and Preliminary Remarks
Simplifying Models Via Transformation
Thinking About the Error Structure
Preliminary Remarks on the Power Family of Transformations
13.2. Power Family of Transformations on the Response: Box–Cox Method
Maximum Likelihood Method of Estimating λ
Some Conversations on How to Proceed
Approximate Confidence Interval for λ
The Confidence Statement Has Several Forms
Importance of Checking Residuals
13.3. A Second Method for Estimating λ
Advantages of the Likelihood Method
13.4. Response Transformations: Other Interesting and Sometimes Useful Plots
13.5. Other Types of Response Transformations
A Two-parameter Family of Response Transformations
A Modulus Family of Response Transformations
Transforming Both Sides of the Model
A Power Family for Proportions
13.6. Response Transformations Chosen to Stabilize Variance
Estimation of K in Table 13.11
Transformations for Responses That Are Proportions
Transformations for Responses That Are Poisson Counts
Chapter 14: “Dummy” Variables
What Are “Dummy” Variables?
An Infinite Number of Choices
14.1. Dummy Variables to Separate Blocks of Data with Different Intercepts, Same Model
Three Categories, Three Dummies
An Alternative Analysis of Variance Sequence
Will My Selected Dummy Setup Work?
Other Verification Methods
14.2. Interaction Terms Involving Dummy Variables
Two Sets of Data, Straight Line Models
Three Sets of Data, Straight Line Models
Two Sets of Data: Quadratic Model
General Case: r Sets, Linear Model
14.3. Dummy Variables for Segmented Models
Case 1: When It Is Known Which Points Lie on Which Segments
Straight Line and Quadratic Curve
Case 2: When It Is Not Known Which Points Lie on Which Segments
Chapter 15: Selecting the “Best” Regression Equation
Some Cautionary Remarks on the Use of Unplanned Data
15.1. All Possible Regressions and “Best Subset” Regression
Use of the Residual Mean Square, s2
Use of the Mallows Cp Statistic
Example of Use of the Cp Statistic
15.2. Stepwise Regression
Stepwise Regression on the Hald Data
Minitab Version of Stepwise Regression
15.3. Backward Elimination
15.4. Significance Levels for Selection Procedures
Selecting Significance Levels in Stepwise Regression
A Drawback to Understand but Not Be Overly Concerned About
15.5. Variations and Summary
Variations on the Previous Methods
15.6. Selection Procedures Applied to the Steam Data
Appendix 15A. Hald Data, Correlation Matrix, and All 15 Possible Regressions
Chapter 16: Ill-conditioning in Regression Data
Demonstrating Dependence in X Via Regression
16.2. Centering Regression Data
Singularity and Centering
16.3. Centering and Scaling Regression Data
Centering and Scaling and Singularity
16.4. Measuring Multicollinearity
Recommendations on Suggestions 1–6
What Are the Relationships?
16.5. Belsley's Suggestion for Detecting Multicollinearity
How Large Is a “Large” Condition Index?
Appendix 16A. Transforming X Matrices to Obtain Orthogonal Columns
Chapter 17: Ridge Regression
17.2. Basic Form of Ridge Regression
17.3. Ridge Regression of the Hald Data
Possible Use of Ridge Regression as a Selection Procedure; Other θ*
17.4. In What Circumstances Is Ridge Regression Absolutely the Correct Way to Proceed?
Canonical Form of Ridge Regression
17.5. The Phoney Data Viewpoint
Ridge Regression Simulations—a Caution
Appendix 17A. Ridge Estimates in Terms of Least Squares Estimates
Appendix 17B. Mean Square Error Argument
Appendix 17C. Canonical Form of Ridge Regression
Some Alternative Formulas
Chapter 18: Generalized Linear Models (GLIM)
18.2. The Exponential Family of Distributions
Some Members of the Exponential Family
Expected Value and Variance of a(u)
Joint Probability Density of a Sample
18.3. Fitting Generalized Linear Models (GLIM)
Example: Binomial Distributions, Indices ni, Parameters pi
Estimation Via Maximum Likelihood
18.4. Performing the Calculations: an Example
Chapter 19: Mixture Ingredients as Predictor Variables
19.1. Mixture Experiments: Experimental Spaces
19.2. Models for Mixture Experiments
Three Ingredients, First-order Model
q Ingredients, First-order Model
Test for Overall Regression
Three Ingredients, Second-order Model
q Ingredients, Second-order Model
Third-order Canonical Model Form
19.3. Mixture Experiments in Restricted Regions
Appendix 19A. Transforming q Mixture Variables to q - 1 Working Variables
Chapter 20: The Geometry of Least Squares
20.2. Pythagoras and Analysis of Variance
Further Split-up of a Regression Sum of Squares
An Orthogonal Breakup of the Normal Equations
Orthogonalizing the Vectors of X in General
20.3. Analysis of Variance and F-test for Overall Regression
20.4. The Singular X'X Case: An Example
20.5. Orthogonalizing in the General Regression Case
20.6. Range Space and Null Space of a Matrix M
20.7. The Algebra and Geometry of Pure Error
The Geometry of Pure Error
Appendix 20A. Generalized Inverses M⁻
Getting a Generalized Inverse
Chapter 21: More Geometry of Least Squares
21.1. The Geometry of a Null Hypothesis: A Simple Example
21.2. General Case H0: Aβ = C: The Projection Algebra
21.3. Geometric Illustrations
21.4. The F-test for H0, Geometrically
21.6. Change in R2 for Models Nested Via Aβ = 0, Not Involving β0
21.7. Multiple Regression with Two Predictor Variables as a Sequence of Straight Line Regressions
Geometrical Interpretation
Chapter 22: Orthogonal Polynomials and Summary Data
22.2. Orthogonal Polynomials
Orthogonal Polynomials for n = 3, ..., 12
22.3. Regression Analysis of Summary Data
Chapter 23: Multiple Regression Applied to Analysis of Variance Problems
Fixed Effects, Variable Effects
23.2. The One-way Classification: Standard Analysis and an Example
23.3. Regression Treatment of the One-way Classification Example
Relationship to the Underlying Geometry
23.4. Regression Treatment of the One-way Classification Using the Original Model
23.5. Regression Treatment of the One-way Classification: Independent Normal Equations
23.6. The Two-way Classification with Equal Numbers of Observations in the Cells: An Example
23.7. Regression Treatment of the Two-way Classification Example
23.8. The Two-way Classification with Equal Numbers of Observations in the Cells
23.9. Regression Treatment of the Two-way Classification with Equal Numbers of Observations in the Cells
23.10. Example: the Two-way Classification
23.11. Recapitulation and Comments
Chapter 24: an Introduction to Nonlinear Estimation
24.1. Least Squares for Nonlinear Models
Least Squares in the Nonlinear Case
24.2. Estimating the Parameters of a Nonlinear System
A Geometrical Interpretation of Linearization
The Importance of Good Starting Values
Getting Initial Estimates θ0
A Solution Through the Normal Equations
A Solution Through the Linearization Technique
Some Typical Nonlinear Program Output Features
24.4. A Note on Reparameterization of the Model
24.5. The Geometry of Linear Least Squares
The Sample Space When n = 3, p = 2
The Sample Space Geometry When the Model Is Wrong
Geometrical Interpretation of Pure Error
The Parameter Space When p = 2
24.6. The Geometry of Nonlinear Least Squares
Confidence Contours in the Nonlinear Case
24.7. Nonlinear Growth Models
An Example of a Mechanistic Growth Model
Querying the Least Squares Assumptions
Another Form of the Logistic Model
How Do We Get the Initial Parameter Estimates?
24.8. Nonlinear Models: Other Work
Design of Experiments in the Nonlinear Case
A Useful Model-building Technique
Chapter 25: Robust Regression
Why Use Robust Regression?
25.1. Least Absolute Deviations Regression (L1 Regression)
The M-estimation Procedure
25.3. Steel Employment Example
Adjusting the First Observation
Standard Errors of Estimated Coefficients
25.5. Least Median of Squares (LMS) Regression
25.6. Robust Regression with Ranked Residuals (RREG)
25.8. Comments and Opinions
Chapter 26: Resampling Procedures (Bootstrapping)
26.1. Resampling Procedures for Regression Models
26.2. Example: Straight Line Fit
26.3. Example: Planar Fit, Three Predictors
Appendix 26A. Sample Minitab Programs to Bootstrap Residuals for a Specific Example
Appendix 26B. Sample Minitab Programs to Bootstrap Pairs for a Specific Example
Percentage Points of the t-distribution
Percentage Points of the χ²-distribution
Percentage Points of the F-distribution
Index of Authors Associated with Exercises