Chapter 1: Getting Started
Data, information, and knowledge
Inter-relationship between data, information, and knowledge
The data analysis process
Quantitative versus qualitative data analysis
Importance of data visualization
Tools and toys for this book
Chapter 2: Preprocessing Data
Parsing a CSV file with the CSV module
Parsing a CSV file using NumPy
Parsing a JSON file using the JSON module
Parsing XML in Python using the XML module
Getting started with OpenRefine
Chapter 3: Getting to Grips with Visualization
Working with web-based visualization
Exploring scientific visualization
The visualization life cycle
Visualizing different types of data
Getting started with D3.js
Interaction and animation
Data from social networks
An overview of visual analytics
Chapter 4: Text Classification
Learning and classification
E-mail subject line tester
Chapter 5: Similarity-Based Image Retrieval
Processing the image dataset
Chapter 6: Simulation of Stock Prices
Generating random numbers
Chapter 7: Predicting Gold Prices
Working with time series data
Components of a time series
The data – historical gold prices
Smoothing the gold prices time series
Predicting in the smoothed time series
Contrasting the predicted value
Chapter 8: Working with Support Vector Machines
Understanding the multivariate dataset
Linear Discriminant Analysis (LDA)
Principal Component Analysis (PCA)
The double spiral problem
Chapter 9: Modeling Infectious Diseases with Cellular Automata
Introduction to epidemiology
The epidemiology triangle
Solving the ordinary differential equation for the SIR model with SciPy
Modeling with cellular automata
Cell, state, grid, neighborhood
Global stochastic contact model
Simulation of the SIRS model in CA with D3.js
Chapter 10: Working with Social Graphs
Acquiring the Facebook graph
Working with graphs using Gephi
Graph visualization with D3.js
Chapter 11: Working with Twitter Data
The anatomy of Twitter data
Using OAuth to access the Twitter API
Getting started with Twython
Simple search using Twython
Working with places and trends
Chapter 12: Data Processing and Aggregation with MongoDB
Getting started with MongoDB
Data transformation with OpenRefine
Inserting documents with PyMongo
Chapter 13: Working with MapReduce
Using MapReduce with MongoDB
Filtering the input collection
Counting the most common words in tweets
Chapter 14: Online Data Analysis with Jupyter and Wakari
Getting started with Wakari
Creating an account in Wakari
Getting started with the IPython notebook
Introduction to image processing with PIL
Working with an image histogram
Getting started with pandas
Working with multivariate datasets with DataFrame
Grouping, aggregation, and correlation
Chapter 15: Understanding Data Processing using Apache Spark
Platform for data processing
An introduction to the distributed file system
First steps with Hadoop Distributed File System – HDFS
File management with HUE – web interface
An introduction to Apache Spark
The Spark programming model
An introductory working example of Apache Spark