Python: Advanced Predictive Analytics

Author: Joseph J Ashish Kumar

Publisher: Packt Publishing

Publication year: 2017

E-ISBN: 9781788993036

P-ISBN (Paperback): 89543100123820

Subject: TP312 Programming languages, algorithmic languages

Language: ENG

Credits

Preface

TOC

Module 1: Learning Predictive Analytics with Python

Chapter 1: Getting Started with Predictive Modelling

Introducing predictive modelling

Applications and examples of predictive modelling

Python and its packages – download and installation

Python and its packages for predictive modelling

IDEs for Python

Summary

Chapter 2: Data Cleaning

Reading the data – variations and examples

Various methods of importing data in Python

Basics – summary, dimensions, and structure

Handling missing values

Creating dummy variables

Visualizing a dataset by basic plotting

Summary

Chapter 3: Data Wrangling

Subsetting a dataset

Generating random numbers and their usage

Grouping the data – aggregation, filtering, and transformation

Random sampling – splitting a dataset in training and testing datasets

Concatenating and appending data

Merging/joining datasets

Summary

Chapter 4: Statistical Concepts for Predictive Modelling

Random sampling and the central limit theorem

Hypothesis testing

Chi-square tests

Correlation

Summary

Chapter 5: Linear Regression with Python

Understanding the maths behind linear regression

Making sense of result parameters

Implementing linear regression with Python

Model validation

Handling other issues in linear regression

Summary

Chapter 6: Logistic Regression with Python

Linear regression versus logistic regression

Understanding the math behind logistic regression

Implementing logistic regression with Python

Model validation and evaluation

Model validation

Summary

Chapter 7: Clustering with Python

Introduction to clustering – what, why, and how?

Mathematics behind clustering

Implementing clustering using Python

Fine-tuning the clustering

Summary

Chapter 8: Trees and Random Forests with Python

Introducing decision trees

Understanding the mathematics behind decision trees

Implementing a decision tree with scikit-learn

Understanding and implementing regression trees

Understanding and implementing random forests

Summary

Chapter 9: Best Practices for Predictive Modelling

Best practices for coding

Best practices for data handling

Best practices for algorithms

Best practices for statistics

Best practices for business contexts

Summary

Appendix: A List of Links

Module 2: Mastering Predictive Analytics with Python

Chapter 1: From Data to Decisions – Getting Started with Analytic Applications

Designing an advanced analytic solution

Case study: sentiment analysis of social media feeds

Case study: targeted e-mail campaigns

Summary

Chapter 2: Exploratory Data Analysis and Visualization in Python

Exploring categorical and numerical data in IPython

Time series analysis

Working with geospatial data

Introduction to PySpark

Summary

Chapter 3: Finding Patterns in the Noise – Clustering and Unsupervised Learning

Similarity and distance metrics

Affinity propagation – automatically choosing cluster numbers

k-medoids

Agglomerative clustering

Streaming clustering in Spark

Summary

Chapter 4: Connecting the Dots with Models – Regression Methods

Linear regression

Tree methods

Scaling out with PySpark – predicting year of song release

Summary

Chapter 5: Putting Data in its Place – Classification Methods and Analysis

Logistic regression

Fitting the model

Evaluating classification models

Separating nonlinear boundaries with support vector machines

Comparing classification methods

Case study: fitting classifier models in PySpark

Summary

Chapter 6: Words and Pixels – Working with Unstructured Data

Working with textual data

Principal component analysis

Images

Case study: training a recommender system in PySpark

Summary

Chapter 7: Learning from the Bottom Up – Deep Networks and Unsupervised Features

Learning patterns with neural networks

The TensorFlow library and digit recognition

Summary

Chapter 8: Sharing Models with Prediction Services

The architecture of a prediction service

Clients and making requests

Server – the web traffic controller

Persisting information with database systems

Case study – logistic regression service

Summary

Chapter 9: Reporting and Testing – Iterating on Analytic Systems

Checking the health of models with diagnostics

Iterating on models through A/B testing

Guidelines for communication

Summary

Bibliography

Index

Thankspage
