Python: Real World Machine Learning

Author: Prateek Joshi; John Hearty; Bastiaan Sjardin; Luca Massaron; Alberto Boschetti

Publisher: Packt Publishing

Publication year: 2016

E-ISBN: 9781787120679

P-ISBN(Paperback): 9781787123212

Subject: TP274 Data processing, data processing systems; TP312 Programming languages, algorithmic languages

Keyword: Programming languages, algorithmic languages; automation technology, computer technology; data processing, data processing systems

Language: ENG

Description

Learn to solve challenging data science problems by building powerful machine learning models using Python.

About This Book

• Understand which algorithms to use in a given context with the help of this exciting recipe-based guide
• This practical tutorial tackles real-world computing problems through a rigorous and effective approach
• Build state-of-the-art models and develop personalized recommendations to perform machine learning at scale

Who This Book Is For

This Learning Path is for Python programmers who are looking to use machine learning algorithms to create real-world applications. It is ideal for Python professionals who want to work with large and complex datasets, and for Python developers, analysts, and data scientists who are looking to add to their existing skills by accessing some of the most powerful recent trends in data science. Experience with Python, Jupyter Notebooks, and command-line execution, together with a good level of mathematical knowledge to understand the concepts, is expected. Basic machine learning knowledge is also expected.

What You Will Learn

• Use predictive modeling and apply it to real-world problems
• Understand how to perform market segmentation using unsupervised learning
• Apply your new-found skills to solve real problems, through clearly-explained code for every technique and test
• Compete with top data scientists by gaining a practical and theoretical understanding of cutting-edge deep learning algorithms
• Increase predictive a

Chapter

Preprocessing data using different techniques

Label encoding

Building a linear regressor

Computing regression accuracy

Achieving model persistence

Building a ridge regressor

Building a polynomial regressor

Estimating housing prices

Computing the relative importance of features

Estimating bicycle demand distribution

Chapter 2: Constructing a Classifier

Introduction

Building a simple classifier

Building a logistic regression classifier

Building a Naive Bayes classifier

Splitting the dataset for training and testing

Evaluating the accuracy using cross-validation

Visualizing the confusion matrix

Extracting the performance report

Evaluating cars based on their characteristics

Extracting validation curves

Extracting learning curves

Estimating the income bracket

Chapter 3: Predictive Modeling

Introduction

Building a linear classifier using Support Vector Machines (SVMs)

Building a nonlinear classifier using SVMs

Tackling class imbalance

Extracting confidence measurements

Finding optimal hyperparameters

Building an event predictor

Estimating traffic

Chapter 4: Clustering with Unsupervised Learning

Introduction

Clustering data using the k-means algorithm

Compressing an image using vector quantization

Building a Mean Shift clustering model

Grouping data using agglomerative clustering

Evaluating the performance of clustering algorithms

Automatically estimating the number of clusters using the DBSCAN algorithm

Finding patterns in stock market data

Building a customer segmentation model

Chapter 5: Building Recommendation Engines

Introduction

Building function compositions for data processing

Building machine learning pipelines

Finding the nearest neighbors

Constructing a k-nearest neighbors classifier

Constructing a k-nearest neighbors regressor

Computing the Euclidean distance score

Computing the Pearson correlation score

Finding similar users in the dataset

Generating movie recommendations

Chapter 6: Analyzing Text Data

Introduction

Preprocessing data using tokenization

Stemming text data

Converting text to its base form using lemmatization

Dividing text using chunking

Building a bag-of-words model

Building a text classifier

Identifying the gender

Analyzing the sentiment of a sentence

Identifying patterns in text using topic modeling

Chapter 7: Speech Recognition

Introduction

Reading and plotting audio data

Transforming audio signals into the frequency domain

Generating audio signals with custom parameters

Synthesizing music

Extracting frequency domain features

Building Hidden Markov Models

Building a speech recognizer

Chapter 8: Dissecting Time Series and Sequential Data

Introduction

Transforming data into the time series format

Slicing time series data

Operating on time series data

Extracting statistics from time series data

Building Hidden Markov Models for sequential data

Building Conditional Random Fields for sequential text data

Analyzing stock market data using Hidden Markov Models

Chapter 9: Image Content Analysis

Introduction

Operating on images using OpenCV-Python

Detecting edges

Histogram equalization

Detecting corners

Detecting SIFT feature points

Building a Star feature detector

Creating features using visual codebook and vector quantization

Training an image classifier using Extremely Random Forests

Building an object recognizer

Chapter 10: Biometric Face Recognition

Introduction

Capturing and processing video from a webcam

Building a face detector using Haar cascades

Building eye and nose detectors

Performing Principal Components Analysis

Performing Kernel Principal Components Analysis

Performing blind source separation

Building a face recognizer using Local Binary Patterns Histogram

Chapter 11: Deep Neural Networks

Introduction

Building a perceptron

Building a single layer neural network

Building a deep neural network

Creating a vector quantizer

Building a recurrent neural network for sequential data analysis

Visualizing the characters in an optical character recognition database

Building an optical character recognizer using neural networks

Chapter 12: Visualizing Data

Introduction

Plotting 3D scatter plots

Plotting bubble plots

Animating bubble plots

Drawing pie charts

Plotting date-formatted time series data

Plotting histograms

Visualizing heat maps

Animating dynamic signals

Module 2: Advanced Machine Learning with Python

Chapter 1: Unsupervised Machine Learning

Principal component analysis

Introducing k-means clustering

Self-organizing maps

Further reading

Summary

Chapter 2: Deep Belief Networks

Neural networks – a primer

Restricted Boltzmann Machine

Deep belief networks

Further reading

Summary

Chapter 3: Stacked Denoising Autoencoders

Autoencoders

Stacked Denoising Autoencoders

Further reading

Summary

Chapter 4: Convolutional Neural Networks

Introducing the CNN

Further reading

Summary

Chapter 5: Semi-Supervised Learning

Introduction

Understanding semi-supervised learning

Semi-supervised algorithms in action

Further reading

Summary

Chapter 6: Text Feature Engineering

Introduction

Text feature engineering

Further reading

Summary

Chapter 7: Feature Engineering Part II

Introduction

Creating a feature set

Feature engineering in practice

Further reading

Summary

Chapter 8: Ensemble Methods

Introducing ensembles

Using models in dynamic applications

Further reading

Summary

Chapter 9: Additional Python Machine Learning Tools

Alternative development tools

Further reading

Summary

Appendix: Chapter Code Requirements

Module 3: Large Scale Machine Learning with Python

Chapter 1: First Steps to Scalability

Explaining scalability in detail

Python for large scale machine learning

Python packages

Summary

Chapter 2: Scalable Learning in Scikit-learn

Out-of-core learning

Streaming data from sources

Stochastic learning

Feature management with data streams

Summary

Chapter 3: Fast SVM Implementations

Datasets to experiment with on your own

Support Vector Machines

Feature selection by regularization

Including non-linearity in SGD

Hyperparameter tuning

Summary

Chapter 4: Neural Networks and Deep Learning

The neural network architecture

Neural networks and regularization

Neural networks and hyperparameter optimization

Neural networks and decision boundaries

Deep learning at scale with H2O

Deep learning and unsupervised pretraining

Deep learning with theanets

Autoencoders and unsupervised learning

Summary

Chapter 5: Deep Learning with TensorFlow

TensorFlow installation

Machine learning on TensorFlow with SkFlow

Keras and TensorFlow installation

Convolutional Neural Networks in TensorFlow through Keras

CNNs with an incremental approach

GPU Computing

Summary

Chapter 6: Classification and Regression Trees at Scale

Bootstrap aggregation

Random forest and extremely randomized forest

Fast parameter optimization with randomized search

CART and boosting

XGBoost

Out-of-core CART with H2O

Summary

Chapter 7: Unsupervised Learning at Scale

Unsupervised methods

Feature decomposition – PCA

PCA with H2O

Clustering – K-means

K-means with H2O

LDA

Summary

Chapter 8: Distributed Environments – Hadoop and Spark

From a standalone machine to a bunch of nodes

Setting up the VM

The Hadoop ecosystem

Spark

Summary

Chapter 9: Practical Machine Learning with Spark

Setting up the VM for this chapter

Sharing variables across cluster nodes

Data preprocessing in Spark

Machine learning with Spark

Summary

Appendix: Introduction to GPUs and Theano

GPU computing

Theano – parallel computing on the GPU

Installing Theano

Bibliography
