Model Selection and Model Averaging (Cambridge Series in Statistical and Probabilistic Mathematics)

Publication series: Cambridge Series in Statistical and Probabilistic Mathematics

Authors: Gerda Claeskens; Nils Lid Hjort

Publisher: Cambridge University Press

Publication year: 2008

E-ISBN: 9780511421235

P-ISBN(Paperback): 9780521852258

Subject: O212 Statistics

Keyword: Probability theory and mathematical statistics

Language: English




Description

Given a data set, you can fit thousands of models at the push of a button, but how do you choose the best? With so many candidate models, overfitting is a real danger. Is the monkey who typed Hamlet actually a good writer? Choosing a model is central to all statistical work with data. We have seen rapid advances in model fitting and in the theoretical understanding of model selection, yet this book is the first to synthesize research and practice from this active field. Model choice criteria are explained, discussed and compared, including the AIC, BIC, DIC and FIC. The uncertainties involved with model selection are tackled, with discussions of frequentist and Bayesian methods; model averaging schemes are presented. Real-data examples are complemented by derivations providing deeper insight into the methodology, and instructive exercises build familiarity with the methods. The companion website features data sets and R code.
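To make the two most common criteria mentioned above concrete, here is a minimal sketch (not taken from the book; the data and model pair are invented for illustration) of how AIC = 2k − 2·log L̂ and BIC = k·log(n) − 2·log L̂ trade off fit against complexity when choosing between two Gaussian candidate models:

```python
import math

def gaussian_loglik(data, mu, sigma2):
    """Log-likelihood of i.i.d. N(mu, sigma2) observations."""
    n = len(data)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in data) / (2 * sigma2))

def fit_and_score(data, fix_mean_at_zero):
    """Fit a Gaussian model by maximum likelihood and return (AIC, BIC).

    The narrow model fixes mu = 0 (one free parameter, sigma^2);
    the wide model also estimates mu (two free parameters).
    """
    n = len(data)
    mu = 0.0 if fix_mean_at_zero else sum(data) / n
    sigma2 = sum((x - mu) ** 2 for x in data) / n  # ML variance estimate
    k = 1 if fix_mean_at_zero else 2               # number of free parameters
    ll = gaussian_loglik(data, mu, sigma2)
    aic = 2 * k - 2 * ll                           # Akaike's criterion
    bic = k * math.log(n) - 2 * ll                 # Bayesian criterion
    return aic, bic

# Illustrative data only: the sample mean is well away from zero,
# so both criteria should favour the wide model here.
data = [0.8, 1.3, -0.2, 1.9, 0.6, 1.1, 0.4, 1.5]
for label, fixed in [("N(0, sigma^2)", True), ("N(mu, sigma^2)", False)]:
    aic, bic = fit_and_score(data, fixed)
    print(f"{label}: AIC = {aic:.2f}, BIC = {bic:.2f}")
```

For both criteria the model with the smaller value is preferred; BIC's log(n) penalty grows with sample size, which is why it tends to choose more parsimonious models than AIC in large samples.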

Contents

1.6 Football match prediction

1.7 Speedskating

1.8 Preview of the following chapters

1.9 Notes on the literature

2 Akaike’s information criterion

2.1 Information criteria for balancing fit with complexity

2.2 Maximum likelihood and the Kullback–Leibler distance

2.3 AIC and the Kullback–Leibler distance

2.4 Examples and illustrations

2.5 Takeuchi’s model-robust information criterion

2.6 Corrected AIC for linear regression and autoregressive time series

2.7 AIC, corrected AIC and bootstrap-AIC for generalised linear models

2.8 Behaviour of AIC for moderately misspecified models

2.9 Cross-validation

2.10 Outlier-robust methods

2.11 Notes on the literature

Exercises

3 The Bayesian information criterion

3.1 Examples and illustrations of the BIC

3.2 Derivation of the BIC

3.2.1 Posterior probability of a model

3.2.2 BIC, BIC*, BICexact

3.2.3 A robust version of the BIC using M-estimators

3.3 Who wrote ‘The Quiet Don’?

3.4 The BIC and AIC for hazard regression models

3.5 The deviance information criterion

3.6 Minimum description length

3.7 Notes on the literature

Exercises

4 A comparison of some selection methods

4.1 Comparing selectors: consistency, efficiency and parsimony

4.2 Prototype example: choosing between two normal models

4.3 Strong consistency and the Hannan–Quinn criterion

4.4 Mallows’s Cp and its outlier-robust versions

4.5 Efficiency of a criterion

4.6 Efficient order selection in an autoregressive process and the FPE

4.7 Efficient selection of regression variables

4.8 Rates of convergence

4.9 Taking the best of both worlds?

4.10 Notes on the literature

Exercises

5 Bigger is not always better

5.1 Some concrete examples

5.2 Large-sample framework for the problem

5.2.1 A fixed true model

5.2.2 Asymptotic distributions under local misspecification

5.2.3 Generalisation to regression models

5.3 A precise tolerance limit

5.4 Tolerance regions around parametric models

5.5 Computing tolerance thresholds and radii

5.6 How the 5000-m time influences the 10,000-m time

5.7 Large-sample calculus for AIC

5.8 Notes on the literature

Exercises

6 The focussed information criterion

6.1 Estimators and notation in submodels

6.2 The focussed information criterion, FIC

6.3 Limit distributions and mean squared errors in submodels

6.4 A bias-modified FIC

6.5 Calculation of the FIC

6.6 Illustrations and applications

6.6.1 FIC in logistic regression models

6.6.2 FIC in the normal linear regression model

6.6.3 FIC in a skewed regression model

6.6.4 FIC for football prediction

6.6.5 FIC for speedskating prediction

6.6.6 FIC in generalised linear models

6.7 Exact mean squared error calculations for linear regression

6.8 The FIC for Cox proportional hazard regression models

6.9 Average-FIC

6.10 A Bayesian focussed information criterion

6.11 Notes on the literature

Exercises

7 Frequentist and Bayesian model averaging

7.1 Estimators-post-selection

7.2 Smooth AIC, smooth BIC and smooth FIC weights

7.3 Distribution of model average estimators

7.4 What goes wrong when we ignore model selection?

7.4.1 The degree of over-optimism

7.4.2 The inflated type I error

7.5 Better confidence intervals

7.5.1 Correcting the standard error

7.5.2 Correcting the bias using wide variance

7.5.3 Simulation from the Λ distribution

7.5.4 A two-stage confidence procedure

7.6 Shrinkage, ridge estimation and thresholding

7.6.1 Shrinkage and ridge regression

7.6.2 Thresholding in wavelet smoothing

7.7 Bayesian model averaging

7.8 A frequentist view of Bayesian model averaging

7.9 Bayesian model selection with canonical normal priors

7.10 Notes on the literature

Exercises

8 Lack-of-fit and goodness-of-fit tests

8.1 The principle of order selection

8.2 Asymptotic distribution of the order selection test

8.3 The probability of overfitting

8.4 Score-based tests

8.5 Two or more covariates

8.6 Neyman’s smooth tests and generalisations

8.6.1 The original Neyman smooth test

8.6.2 Data-driven Neyman smooth tests

8.7 A comparison between AIC and the BIC for model testing

8.8 Goodness-of-fit monitoring processes for regression models

8.9 Notes on the literature

Exercises

9 Model selection and averaging schemes in action

9.1 AIC and BIC selection for Egyptian skull development data

9.2 Low birthweight data: FIC plots and FIC selection per stratum

9.3 Survival data on PBC: FIC plots and FIC selection

9.4 Speedskating data: averaging over covariance structure models

Exercises

10 Further topics

10.1 Model selection in mixed models

10.1.1 AIC for linear mixed models

10.1.2 REML versus ML

10.1.3 Consistent model selection criteria

10.2 Boundary parameters

10.2.1 Maximum likelihood theory with a boundary parameter

10.2.2 Maximum likelihood theory with several boundary parameters

10.3 Finite-sample corrections

10.4 Model selection with missing data

10.5 When p and q grow with n

10.6 Notes on the literature

Overview of data examples

Egyptian skulls

The (not so) Quiet Don

Survival with primary biliary cirrhosis

Low birthweight data

Football matches

Speedskating

Mortality in ancient Egypt

Exponential decay of beer froth

Blood groups A, B, AB, O

Health Assessment Questionnaires

The Raven

Danish melanoma data

Survival for oropharynx carcinoma

Fifty years survival since graduation

Onset of menarche

Australian Institute of Sports data

CH4 concentrations

Low-iron rat teratology data

POPS data

Birds on islands

References

Author index

Subject index
