Chapter 1: Basic Concepts
1.1. Background and Scope
1.1.1. What Is Statistics?
1.1.2. What Is Big Data Analytics?
1.1.3. Data Analysis Cycle
1.1.4. Some Applications in the Petroleum Geosciences
1.2. Data, Statistics, and Probability
1.2.1. Outcomes and Events
1.2.3. Conditional Probability and Bayes Rule
1.3.3. Indicator Transform
Chapter 2: Exploratory Data Analysis
2.1.1. Measures of Center
2.1.2. Measures of Spread
2.1.3. Measures of Asymmetry
2.1.4. Graphing Univariate Data
2.2.2. Correlation and Rank Correlation
2.2.3. Graphing Bivariate Data
Chapter 3: Distributions and Models Thereof
3.1. Empirical Distributions
3.2.1. Uniform Distribution
3.2.2. Triangular Distribution
3.2.3. Normal Distribution
3.2.4. Lognormal Distribution
3.2.5. Poisson Distribution
3.2.6. Exponential Distribution
3.2.7. Binomial Distribution
3.2.8. Weibull Distribution
3.3. Working With Normal and Log-Normal Distributions
3.3.1. Normal Distribution
3.3.2. Normal Score Transformation
3.3.3. Log-Normal Distribution
3.4. Fitting Distributions to Data
3.4.2. Parameter Estimation Techniques
Linear Regression Analysis
Nonlinear Least-Squares Analysis
3.5. Other Properties of Distributions and Their Evaluation
3.5.1. Central Limit Theorem and Confidence Limits
3.5.2. Bootstrap Sampling
3.5.3. Comparing Two Distributions
Testing for Difference in Mean
Testing for Difference in Distributions
Other Methods for Comparing Distributions
Chapter 4: Regression Modeling and Analysis
4.2. Simple Linear Regression
4.2.1. Formulating and Solving the Linear Regression Problem
4.2.2. Evaluating the Linear Regression Model
4.2.3. Properties of the Regression Parameters and Confidence Limits
4.2.4. Estimating Confidence Intervals for the Mean Response and Forecast
4.2.5. An Illustrative Example of Linear Regression Modeling and Analysis
4.3.1. Formulating and Solving the Multiple Regression Model
4.3.2. Evaluating the Multiple Regression Model
4.3.3. How Many Terms in the Regression Model?
4.3.4. Analysis of Variance (ANOVA) Table
4.3.5. An Illustrative Example of Multiple Regression Modeling and Analysis
4.4. Nonparametric Transformation and Regression
4.4.1. Conditional Expectation and Scatterplot Smoothers
4.4.2. Generalized Additive Models
4.4.3. Response Transformation Models: ACE Algorithm and Its Variations
4.4.4. Data Correlation via Nonparametric Transformation
4.5. Field Application for Nonparametric Regression: The Salt Creek Data Set
4.5.1. Dataset Description
4.5.2. Variable Selection
4.5.3. Optimal Transformations and Optimal Correlation
Chapter 5: Multivariate Data Analysis
5.2. Principal Component Analysis
5.2.1. Computing the Principal Components
5.2.2. An Illustrative Example of the Principal Component Analysis
5.3.1. k-Means Clustering
An Illustrative Example of k-Means Clustering
5.3.2. Hierarchical Clustering
An Illustrative Example of Hierarchical Clustering
5.3.3. Model-Based Clustering
5.4. Discriminant Analysis
An Illustrative Example of Discriminant Analysis
5.5. Field Application: The Salt Creek Data Set
5.5.1. Dataset Description
5.5.4. Data Correlation and Prediction
Chapter 6: Uncertainty Quantification
6.1.1. Deterministic Versus Probabilistic Approach
6.1.2. Elements of a Systematic Framework
6.1.3. Role of Monte Carlo Simulation
6.2. Uncertainty Characterization
6.2.1. Screening for Key Uncertain Inputs
6.2.2. Fitting Distributions to Data
6.2.3. Maximum Entropy Distribution Selection
6.2.4. Generation of Subjective Probability Distributions
6.3. Uncertainty Propagation
Correlation Control in LHS
6.3.2. Computational Considerations
6.4. Uncertainty Importance Assessment
6.4.1. Basic Concepts in Uncertainty Importance
6.4.2. Scatter Plots and Rank Correlation Analysis
6.4.3. Stepwise Regression and Partial Rank Correlation Analysis
6.4.4. Other Measures of Variable Importance
Entropy (Mutual Information) Analysis
Classification Tree Analysis
6.5. Moving Beyond Monte Carlo Simulation
6.5.1. First-Order Second-Moment Method (FOSM)
General Expressions for Mean and Variance
Error Analysis in Additive and Multiplicative Models
6.5.2. Point Estimate Method (PEM)
6.5.3. Logic Tree Analysis (LTA)
6.6. Treatment of Model Uncertainty
6.6.2. Moment-Matching Weighting Method for Geostatistical Models
6.6.3. Example Field Application
6.7. Elements of a Good Uncertainty Analysis Study
Chapter 7: Experimental Design and Response Surface Analysis
Central Composite and Box-Behnken
Comparison of Factorial Designs
Comparison of Sampling Designs
7.3. Metamodeling Techniques
7.3.2. Quadratic Model With LASSO Variable Selection
7.3.4. Radial Basis Functions
7.3.5. Metamodel Performance Evaluation Metric
7.4. An Illustration of Experimental Design and Response Surface Modeling
7.5. Field Application of Experimental Design and Response Surface Modeling
7.5.1. Problem of Interest
7.5.2. Proxy Construction and Application Strategy
Chapter 8: Data-Driven Modeling
8.1.2. Data-Driven Models: What and Why?
8.2.1. Classification and Regression Trees
8.2.3. Gradient Boosting Machine
8.2.4. Support Vector Machine
8.2.5. Artificial Neural Network
8.2.6. Model Strengths and Weaknesses
8.3. Computational Considerations
8.3.2. Automatic Tuning of Model Parameters
8.3.3. Variable Importance
8.4.1. Dataset Description
8.4.2. Predictive Model Building
8.4.3. Variable Importance and Conditional Sensitivity
8.4.4. Classification Tree Analysis
Chapter 9: Concluding Remarks
9.1. The Path We Have Taken
9.1.1. Recapitulation of Topics
9.1.2. Style and Intended Use
9.2.2. Simple Model, or Complex?
9.2.3. One Model, or Many?
9.2.4. Is Past Always Prolog?
9.2.5. To Fit, or Overfit?