2.4 Dynamic programming with more complex models
Alignment with affine gap scores
2.5 Heuristic alignment algorithms
2.6 Linear space alignments
2.7 Significance of scores
The Bayesian approach: model comparison
The classical approach: the extreme value distribution
Why use the alignment score as the test statistic?
2.8 Deriving score parameters from alignment data
3 Markov chains and hidden Markov models
Modelling the beginning and end of sequences
Using Markov chains for discrimination
Formal definition of an HMM
Most probable state path: the Viterbi algorithm
The backward algorithm and posterior state probabilities
3.3 Parameter estimation for HMMs
Estimation when the state sequence is known
Estimation when paths are unknown: Baum–Welch and Viterbi training
Modelling of labelled sequences
Discriminative estimation
3.5 More complex Markov chains
Finding prokaryotic genes
Inhomogeneous Markov chains
3.6 Numerical stability of HMM algorithms
4 Pairwise alignment using HMMs
The most probable path is the optimal FSA alignment
A pair HMM for local alignment
4.2 The full probability of x and y, summing over all paths
Probabilistic sampling of alignments
Finding distinct suboptimal alignments
4.4 The posterior probability that x_i is aligned to y_j
The expected accuracy of an alignment
4.5 Pair HMMs versus FSAs for searching
5 Profile HMMs for sequence families
5.1 Ungapped score matrices
5.2 Adding insert and delete states to obtain profile HMMs
Profile HMMs generalise pairwise alignment
5.3 Deriving profile HMMs from multiple alignments
Non-probabilistic profiles
Basic profile HMM parameterisation
5.4 Searching with profile HMMs
Alternatives to log-odds scoring
5.5 Profile HMM variants for non-global alignments
5.6 More on estimation of probabilities
Substitution matrix mixtures
Deriving P(b|a) from an arbitrary matrix
Estimation based on an ancestor
Testing the pseudocount methods
5.7 Optimal model construction
MAP match–insert assignment
5.8 Weighting training sequences
Simple weighting schemes derived from a tree
Root weights from Gaussian parameters
Maximum discrimination weights
6 Multiple sequence alignment methods
6.1 What a multiple alignment means
6.2 Scoring a multiple alignment
6.3 Multidimensional dynamic programming
6.4 Progressive alignment methods
Feng–Doolittle progressive multiple alignment
Iterative refinement methods
6.5 Multiple alignment by profile HMM training
Multiple alignment with a known profile HMM
Overview of profile HMM training from unaligned sequences
Baum–Welch expectation maximisation
Theoretical basis of simulated annealing
Noise injection during Baum–Welch HMM re-estimation
Simulated annealing Viterbi estimation of HMMs
Comparison to Gibbs sampling
Adaptively modifying model architecture; model surgery
7 Building phylogenetic trees
Counting and labelling trees
7.3 Making a tree from pairwise distances
Clustering methods: UPGMA
Molecular clocks and the ultrametric property of distances
Additivity and neighbour-joining
Selecting labelled branching patterns by branch and bound
7.5 Assessing the trees: the bootstrap
7.6 Simultaneous alignment and phylogeny
Sankoff & Cedergren’s gap-substitution algorithm
Hein’s affine cost algorithm
Limitations of Hein’s model
7.8 Appendix: proof of neighbour-joining theorem
8 Probabilistic approaches to phylogeny
Overview of the probabilistic approach to phylogeny
8.2 Probabilistic models of evolution
8.3 Calculating the likelihood for ungapped alignments
The case of two sequences
The likelihood for an arbitrary number of sequences
Reversibility and independence of root position
8.4 Using the likelihood for inference
Maximising the likelihood
Sampling from the posterior distribution
A proposal distribution for phylogenetic trees
Other phylogenetic uses of sampling
8.5 Towards more realistic evolutionary models
Allowing different rates at different sites
Evolutionary models with gaps
Evaluating different probabilistic models
8.6 Comparison of probabilistic and non-probabilistic methods
A probabilistic interpretation of parsimony
Maximum likelihood and pairwise distance methods
A probabilistic interpretation of Sankoff & Cedergren
Interpreting Hein’s algorithm probabilistically
9 Transformational grammars
9.1 Transformational grammars
Definition of a transformational grammar
Deterministic vs. nondeterministic automata
What a regular grammar can’t do
9.3 Context-free grammars
9.4 Context-sensitive grammars
Unrestricted grammars and Turing machines
Stochastic context-sensitive or unrestricted grammars
Hidden Markov models are stochastic regular grammars
9.6 Stochastic context-free grammars for sequence modelling
Normal forms for stochastic context-free grammars
Parameter re-estimation by expectation maximisation
The CYK alignment algorithm
Summary of SCFG algorithms
10 RNA structure analysis
Terminology of RNA secondary structure
RNA sequence evolution is constrained by structure
Inferring structure by comparative sequence analysis
10.2 RNA secondary structure prediction
Base pair maximisation and the Nussinov folding algorithm
An SCFG version of the Nussinov algorithm
Energy minimisation and the Zuker folding algorithm
Base pair confidence estimates
10.3 Covariance models: SCFG-based RNA profiles
SCFG models of ungapped RNA alignments
Design of covariance models
Construction of a CM from an RNA alignment
Scoring: inside algorithm
Outside algorithm for CMs
Inside–outside expectation maximisation for CM parameters
Database searching: the CYK algorithm
Structural alignment: CYK algorithm with traceback
‘Automated’ comparative sequence analysis using CMs
An example of a practical application of CM algorithms
11 Background on probability
11.1 Probability distributions
The binomial distribution
The Gaussian distribution
The multinomial distribution
The Dirichlet distribution
The extreme value distribution
Relative entropy and mutual information
The posterior probability distribution
Sampling by transformation from a uniform distribution
Sampling from a Dirichlet by rejection
Sampling with the Metropolis algorithm
11.5 Estimation of probabilities from counts
EM explanation of the Baum–Welch algorithm