Module 1: Java for Data Science
Chapter 1: Getting Started with Data Science
Problems solved using data science
Understanding the data science problem-solving approach
Using Java to support data science
Acquiring data for an application
The importance and process of cleaning data
Visualizing data to enhance understanding
The use of statistical methods in data science
Machine learning applied to data science
Using neural networks in data science
Visual and audio analysis
Improving application performance using parallel techniques
Chapter 2: Data Acquisition
Understanding the data formats used in data science applications
Overview of streaming data
Overview of audio/video/images in Java
Data acquisition techniques
Using the HttpURLConnection class
Creating your own web crawler
Using the crawler4j web crawler
Using API calls to access common social media sites
Using OAuth to authenticate users
Handling Excel spreadsheets
The nitty-gritty of cleaning text
Using Java tokenizers to extract words
Third-party tokenizers and libraries
Transforming data into a usable form
Finding and replacing text
Validating e-mail addresses
Changing the contrast of an image
Converting images to different formats
Chapter 4: Data Visualization
Understanding plots and graphs
Using country as the category
Using decade as the category
Chapter 5: Statistical Data Analysis Techniques
Working with mean, mode, and median
Using simple Java techniques to find mean
Using Java 8 techniques to find mean
Using Google Guava to find mean
Using Apache Commons to find mean
Using simple Java techniques to find median
Using Apache Commons to find the median
Using ArrayLists to find multiple modes
Using a HashMap to find multiple modes
Using Apache Commons to find multiple modes
Sample size determination
Using simple linear regression
Using multiple regression
Chapter 6: Machine Learning
Supervised learning techniques
Using a decision tree with a book dataset
Testing the book decision tree
Using an SVM for camping data
Testing individual instances
Unsupervised machine learning
Association rule learning
Using association rule learning to find buying relationships
Chapter 7: Neural Networks
Training a neural network
Getting started with neural network architectures
Understanding static neural networks
Understanding dynamic neural networks
Multilayer perceptron networks
Saving and retrieving the model
Learning vector quantization
Displaying the SOM results
Additional network architectures and algorithms
The k-Nearest Neighbors algorithm
Instantaneously trained networks
Cascading neural networks
Holographic associative memory
Backpropagation and neural networks
Deeplearning4j architecture
Acquiring and manipulating data
Configuring and building a model
Using hyperparameters in ND4J
Instantiating the network model
Deep learning and regression analysis
Reading and preparing the data
Restricted Boltzmann Machines
Building an autoencoder in DL4J
Building and training the network
Saving and retrieving a network
Recurrent Neural Networks
Implementing named entity recognition
Using OpenNLP to perform NER
Identifying location entities
Classifying text by labels
Classifying text by similarity
Understanding tagging and POS
Using OpenNLP to identify POS
Extracting relationships from sentences
Using OpenNLP to extract relationships
Downloading and extracting the Word2Vec model
Building our model and classifying text
Chapter 10: Visual and Audio Analysis
Getting information about voices
Gathering voice information
Understanding speech recognition
Using CMUSphinx to convert speech to text
Obtaining more detail about the words
Extracting text from an image
Using Tess4J to extract text
Using OpenCV to detect faces
Creating a Neuroph Studio project for classifying visual images
Chapter 11: Mathematical and Parallel Techniques for Data Analysis
Implementing basic matrix operations
Using GPUs with DeepLearning4j
Using Apache Hadoop to perform map-reduce
Writing the reduce method
Creating and executing a new Hadoop job
Various mathematical libraries
Using the Apache Commons Math API
Creating an Aparapi application
Using Aparapi for matrix multiplication
Understanding Java 8 lambda expressions and streams
Using Java 8 to perform matrix multiplication
Using Java 8 to perform map-reduce
Chapter 12: Bringing It All Together
Defining the purpose and scope of our application
Understanding the application's architecture
Data acquisition using Twitter
Understanding the TweetHandler class
Extracting data for a sentiment analysis model
Building the sentiment model
Processing the JSON input
Cleaning data to improve our results
Performing sentiment analysis
Other optional enhancements
Module 2: Mastering Java for Data Science
Chapter 1: Data Science Using Java
Natural Language Processing
Data science process models
Data processing libraries
Machine learning and data mining libraries
Chapter 2: Data Processing Toolbox
Extensions to the standard library
Search engine - preparing data
Chapter 3: Exploratory Data Analysis
Exploratory data analysis in Java
Interactive Exploratory Data Analysis in Java
Chapter 4: Supervised Learning - Classification and Regression
Binary classification models
Precision, recall, and F1
Training, validation, and testing
Case study - page prediction
Machine learning libraries for regression
Case study - hardware performance
Chapter 5: Unsupervised Learning - Clustering and Dimensionality Reduction
Unsupervised dimensionality reduction
Principal Component Analysis
Truncated SVD for categorical and sparse data
Clustering for supervised learning
Clustering as dimensionality reduction
Supervised learning via clustering
Chapter 6: Working with Text - Natural Language Processing and Information Retrieval
Natural Language Processing and information retrieval
Vector Space Model - Bag of Words and TF-IDF
Vector space model implementation
Indexing and Apache Lucene
Natural Language Processing tools
Customizing Apache Lucene
Machine learning for texts
Unsupervised learning for texts
Supervised learning for texts
Learning to rank for information retrieval
Chapter 7: Extreme Gradient Boosting
Gradient Boosting Machines and XGBoost
XGBoost for classification
XGBoost for learning to rank
Chapter 8: Deep Learning with DeepLearning4J
Neural Networks and DeepLearning4J
ND4J - N-dimensional arrays for Java
Neural networks in DeepLearning4J
Convolutional Neural Networks
Deep learning for cats versus dogs
Monitoring the performance
Running DeepLearning4J on GPU
Chapter 9: Scaling Data Science
Extracting features from the graph
Link Prediction with MLlib and XGBoost
Chapter 10: Deploying Data Science Models