Practical Data Analysis - Second Edition

Author: Hector Cuesta; Dr. Sampath Kumar

Publisher: Packt Publishing

Publication year: 2016

E-ISBN: 9781785286667

P-ISBN(Paperback): 9781785289712

Subject: TP39 Computer applications

Keywords: computer applications; automation technology; computer technology

Language: ENG

Chapter

Chapter 1: Getting Started

Computer science

Artificial intelligence

Machine learning

Statistics

Mathematics

Knowledge domain

Data, information, and knowledge

Inter-relationship between data, information, and knowledge

The nature of data

The data analysis process

The problem

Data preparation

Data exploration

Predictive modeling

Visualization of results

Quantitative versus qualitative data analysis

Importance of data visualization

What about big data?

Quantified self

Sensors and cameras

Social network analysis

Tools and toys for this book

Why Python?

Why mlpy?

Why D3.js?

Why MongoDB?

Summary

Chapter 2: Preprocessing Data

Data sources

Open data

Text files

Excel files

SQL databases

NoSQL databases

Multimedia

Web scraping

Data scrubbing

Statistical methods

Text parsing

Data transformation

Data formats

Parsing a CSV file with the CSV module

Parsing CSV file using NumPy

JSON

Parsing JSON file using the JSON module

XML

Parsing XML in Python using the XML module

YAML

Data reduction methods

Filtering and sampling

Binned algorithm

Dimensionality reduction

Getting started with OpenRefine

Text facet

Clustering

Text filters

Numeric facets

Transforming data

Exporting data

Operation history

Summary

Chapter 3: Getting to Grips with Visualization

What is visualization?

Working with web-based visualization

Exploring scientific visualization

Visualization in art

The visualization life cycle

Visualizing different types of data

HTML

DOM

CSS

JavaScript

SVG

Getting started with D3.js

Bar chart

Pie chart

Scatter plots

Single line chart

Multiple line chart

Interaction and animation

Data from social networks

An overview of visual analytics

Summary

Chapter 4: Text Classification

Learning and classification

Bayesian classification

Naïve Bayes

E-mail subject line tester

The data

The algorithm

Classifier accuracy

Summary

Chapter 5: Similarity-Based Image Retrieval

Image similarity search

Dynamic time warping

Processing the image dataset

Implementing DTW

Analyzing the results

Summary

Chapter 6: Simulation of Stock Prices

Financial time series

Random Walk simulation

Monte Carlo methods

Generating random numbers

Implementation in D3.js

Quantitative analyst

Summary

Chapter 7: Predicting Gold Prices

Working with time series data

Components of a time series

Smoothing time series

Linear regression

The data – historical gold prices

Nonlinear regressions

Kernel Ridge Regressions

Smoothing the gold prices time series

Predicting in the smoothed time series

Contrasting the predicted value

Summary

Chapter 8: Working with Support Vector Machines

Understanding the multivariate dataset

Dimensionality reduction

Linear Discriminant Analysis (LDA)

Principal Component Analysis (PCA)

Getting started with SVM

Kernel functions

The double spiral problem

SVM implemented on mlpy

Summary

Chapter 9: Modeling Infectious Diseases with Cellular Automata

Introduction to epidemiology

The epidemiology triangle

The epidemic models

The SIR model

Solving the ordinary differential equation for the SIR model with SciPy

The SIRS model

Modeling with Cellular Automaton

Cell, state, grid, neighborhood

Global stochastic contact model

Simulation of the SIRS model in CA with D3.js

Summary

Chapter 10: Working with Social Graphs

Structure of a graph

Undirected graph

Directed graph

Social networks analysis

Acquiring the Facebook graph

Working with graphs using Gephi

Statistical analysis

Male to female ratio

Degree distribution

Histogram of a graph

Centrality

Transforming GDF to JSON

Graph visualization with D3.js

Summary

Chapter 11: Working with Twitter Data

The anatomy of Twitter data

Followers
