An Introduction to Hidden Markov Models

Abstract

 

This unit introduces the concept of hidden Markov models in computational biology. It describes them using simple biological examples, requiring as little mathematical knowledge as possible. The unit also presents a brief history of hidden Markov models and an overview of their current applications before concluding with a discussion of their limitations.

Keywords: Markov Chains; HMM; Hidden Markov Models; Machine Learning; Sequence Analysis

     
 
GO TO THE FULL PROTOCOL:
PDF or HTML at Wiley Online Library

Table of Contents

  • History
  • Common Applications
  • Markov Models
  • Hidden Markov Models
  • Profile Methods for Sequence Analysis
  • Pair HMMs
  • Drawbacks
  • Conclusions
  • Literature Cited
  • Figures

Figures

  • Figure a0.3A.1 A simple Markov model for the transition probabilities as defined by a nucleotide substitution matrix. (A) Example of a short DNA sequence alignment with the transition depicted in green. (B) A simple Markov model for DNA residue substitutions. Every circle represents a state, and each arrow denotes the probability of moving, in the next time step, into the state at the end of the arrow. A stochastic process like this is called a Markov chain if the transition probabilities p depend only on the current state and not on the states visited before it. The transition observed in Panel (A) is highlighted in green. (C) A transition matrix for the model shown in (B).
  • Figure a0.3A.2 A simple hidden Markov model for splice site recognition (Eddy, 2004). (A) The circles again denote states, in this case exon (E), intron (I), or 5′ splice site (5). The edges represent the probabilities of moving from one state to another in each time step. “Time” in this context refers to the residue position, not actual time. The sequence of visited states cannot be observed; instead, one sees nucleotides drawn from a distribution that is typical for each of the three states, shown here as a bar shaded in four colors. The proportion of each color reflects the probability of observing that nucleotide when the state is visited. (B) An example of a sequence (bottom) and its hidden state path (top, bold font). Passing through the model and emitting a residue at each step according to the given emission probabilities creates a sequence like the one shown. Conversely, one can calculate the probability that an observed sequence was created by a given state path.
  • Figure a0.3A.3 Schematic representation of a profile hidden Markov model of length 4. The states are begin (B), end (E), match (M), insert (I), and delete (D). Rectangular states (M and I) emit amino acids or nucleotides; round states are silent. A sequence family is represented by a characteristic distribution of emission probabilities at every match or insert state. Delete states are equivalent to gaps.
  • Figure a0.3A.4 The Leucine Rich Repeat family as a multiple sequence alignment and as an HMM. Black bars mark the columns of the alignment and the positions in the profile HMM that correspond to them. The bottom part shows an HMM logo. HMM logos resemble sequence logos (Schuster‐Böckler et al., 2004). Each position in the HMM corresponds to a column in the logo. Match states are white, insert states are red. The width of a column reflects how likely it is to be skipped by going through the delete state. The height of the letters shows their frequency relative to the overall information content of the state. The information in the HMM can be seen to mirror the composition of the sequence alignment.
  • Figure a0.3A.5 A pair HMM for global pairwise alignment with affine gap penalties as described by Durbin et al. (1998).
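
The caption of the splice-site figure notes that one can calculate the probability that an observed sequence was created by a given state path. A minimal Python sketch of that calculation follows. The state names mirror the caption (E, 5, I), but every probability value, the `START` and `"end"` conventions, and the function name `log_joint_probability` are assumptions made for illustration. They are not values or names taken from the figure or from the full protocol.

```python
import math

# Hypothetical three-state model: E = exon, 5 = 5' splice site, I = intron.
# Every probability below is an illustrative assumption, not a value
# read from the figure.
START = {"E": 1.0}  # assumed: every path begins in the exon state
TRANSITIONS = {
    "E": {"E": 0.9, "5": 0.1},
    "5": {"I": 1.0},
    "I": {"I": 0.9, "end": 0.1},  # "end" is an assumed silent end state
}
EMISSIONS = {
    "E": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "5": {"A": 0.05, "C": 0.0, "G": 0.95, "T": 0.0},
    "I": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def log_joint_probability(sequence, path):
    """Return log P(sequence, path); -inf if the model cannot produce it."""
    if not sequence or len(sequence) != len(path):
        raise ValueError("sequence and path must be non-empty and equally long")
    # Start with the probability of entering the first state.
    factors = [START.get(path[0], 0.0)]
    for i, (residue, state) in enumerate(zip(sequence, path)):
        # Probability that this state emits the observed residue.
        factors.append(EMISSIONS[state].get(residue, 0.0))
        # Probability of moving on to the next state (or to the end).
        next_state = path[i + 1] if i + 1 < len(path) else "end"
        factors.append(TRANSITIONS[state].get(next_state, 0.0))
    if min(factors) == 0.0:
        return -math.inf
    # Sum logarithms instead of multiplying to avoid numerical underflow.
    return sum(math.log(f) for f in factors)
```

Under these assumed parameters, `log_joint_probability("CAGTA", "EE5II")` returns a finite log probability, whereas a path such as `"EEIII"` that skips the splice-site state returns negative infinity, because the sketch defines no direct transition from E to I.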

Literature Cited
   Baldi, P., Chauvin, Y., Hunkapiller, T., and McClure, M.A. 1994. Hidden Markov models of biological primary sequence information. Proc. Natl. Acad. Sci. U.S.A. 91:1059‐1063.
   Baum, L.E. and Petrie, T. 1966. Statistical inference for probabilistic functions of finite state Markov chains. Ann. Math. Stat. 37:1554‐1563.
   Birney, E., Clamp, M., and Durbin, R. 2004. GeneWise and Genomewise. Genome Res. 14:988‐995.
   Burge, C. and Karlin, S. 1997. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268:78‐94.
   Durbin, R., Eddy, S.R., Krogh, A., and Mitchison, G. 1998. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, Cambridge, U.K.
   Eddy, S.R. 1996. Hidden Markov models. Curr. Opin. Struct. Biol. 6:361‐365.
   Eddy, S.R. 2004. What is a hidden Markov model? Nat. Biotechnol. 22:1315‐1316.
   Kulp, D., Haussler, D., Reese, M.G., and Eeckman, F.H. 1996. A generalized hidden Markov model for the recognition of human genes in DNA. Proc. Int. Conf. Intell. Syst. Mol. Biol. 4:134‐142.
   Lukashin, A.V. and Borodovsky, M. 1998. GeneMark.hmm: New solutions for gene finding. Nucl. Acids Res. 26:1107‐1115.
   Madera, M. 2005. Hidden Markov models for detection of remote homology. PhD thesis, University of Cambridge, MRC Laboratory of Molecular Biology, May 2005.
   Meyer, I.M. and Durbin, R. 2004. Gene structure conservation aids similarity based gene prediction. Nucl. Acids Res. 32:776‐783.
   Rabiner, L.R. 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77:257‐286.
   Schuster‐Böckler, B. and Bateman, A. 2005. Visualizing profile‐profile alignment: Pairwise HMM logos. Bioinformatics 21:2912‐2913.
   Schuster‐Böckler, B., Schultz, J., and Rahmann, S. 2004. HMM Logos for visualization of protein families. BMC Bioinformatics 5:7.
   Söding, J. 2005. Protein homology detection by HMM–HMM comparison. Bioinformatics 21:951‐960.