Using PEBBLE for the Evolutionary Analysis of Serially Sampled Molecular Sequences
Abstract
The PEBBLE (Phylogenetics, Evolutionary Biology, and Bioinformatics in a moduLar Environment) application is a relative newcomer to the field of phylogenetic applications. Although designed as a customizable generalist application, PEBBLE was initially developed to implement procedures for the analysis of sequences associated with different sampling times, e.g., rapidly evolving viral genes sampled over the course of infection, or ancient DNA sequences. The basic protocol describes the use of PEBBLE to infer a phylogenetic tree using the sUPGMA algorithm, and the inference of substitution rate parameters using maximum likelihood. The alternate and support protocols describe the simulation capabilities of PEBBLE, and general use of the PEBBLE application, respectively.
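The serial-sample idea mentioned above can be illustrated with a minimal Python sketch of the distance-correction step that sUPGMA (Drummond and Rodrigo, 2000) applies before clustering: every tip is notionally extended to the most recent sampling time, so that an ordinary UPGMA can be run on the corrected distances and the added lengths trimmed off afterwards. This is not PEBBLE code. The function names, the list-of-lists distance matrix, and the assumption that a single substitution rate has already been estimated are all hypothetical choices made for illustration.

```python
import math


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor (JC69) distance in expected substitutions per site."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned and non-empty")
    # Proportion of sites that differ between the two aligned sequences.
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        raise ValueError("p-distance too large for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def correct_serial_distances(dist, times, rate):
    """Extend every tip to the most recent sampling time.

    dist  : symmetric list-of-lists matrix of pairwise distances
    times : sampling time of each sequence (larger means more recent)
    rate  : substitutions per site per time unit (assumed already estimated)

    Returns the corrected matrix and the per-tip offsets that would be
    trimmed from the terminal branches after clustering.
    """
    latest = max(times)
    offsets = [rate * (latest - t) for t in times]
    n = len(dist)
    corrected = [
        [0.0 if i == j else dist[i][j] + offsets[i] + offsets[j]
         for j in range(n)]
        for i in range(n)
    ]
    return corrected, offsets
```

In sUPGMA the rate (or the expected divergence between sampling times) is itself estimated from the pairwise distances; here it is passed in as a known value to keep the sketch short.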
Table of Contents
- Basic Protocol 1: Using PEBBLE for Analysis
- Support Protocol 1: General Use of PEBBLE
- Alternate Protocol 1: Using PEBBLE for Simulation
- Guidelines for Understanding Results
- Commentary
- Literature Cited
- Figures
Materials
Basic Protocol 1: Using PEBBLE for Analysis
Necessary Resources
Figures
- Figure 6.8.1 The main window of the PEBBLE application, running under the Microsoft Windows operating system. Running PEBBLE under other operating systems may produce minor differences. On the left is the object store, on the right is the object view area.
- Figure 6.8.2 The main window after having imported an example alignment, with that alignment now selected. A view of the example alignment is shown in the “object view” window.
- Figure 6.8.3 The “new substitution model” ceblet. The rate matrix is chosen by using the drop‐down menu in the top left of the window. The rate matrix parameters are entered in the upper center of the window. The equilibrium frequencies are entered in the lower left area, and the gamma distribution is defined in the lower right of the window.
- Figure 6.8.4 The “sample information input” ceblet. Sequence names are listed on the left, along with any time/order information attributed to each sequence. The right side contains controls for selecting time or order information.
- Figure 6.8.5 The sUPGMA ceblet, waiting for various options to be selected. The clustering method and number of nonparametric bootstraps are selected on the left. The rate models and the parametric bootstrapping parameters are selected on the right.
- Figure 6.8.6 The result of using the sUPGMA analysis on the example data, described in the text. A single substitution rate was assumed as well as a single within‐sample divergence parameter.
- Figure 6.8.7 The maximum likelihood estimation ceblet for serially sampled data. The top left area in the window allows the user to select the subject of optimization; only Tree is available, as the model selected (JC69) does not have any free parameters. The top right area is for selecting a substitution model (this does not appear if a model was given from the object store). The lower left area allows the user to select the sophistication of the optimizer method, and the lower right area is where the assumptions regarding substitution rate are selected.
- Figure 6.8.8 The maximum likelihood visualization tool in mid‐optimization. The branch lengths of the tree are updated in “real time” as the optimization algorithm searches for the maximum likelihood estimates.
- Figure 6.8.9 For ceblets that require one or more input objects, the first phase of ceblet execution will be object input. In this example the user is prompted for a Tree object and an Alignment object, which must be selected from the object store.
- Figure 6.8.10 A simple example of user interaction within a ceblet. The user is being prompted to select the requested operation from a short list. To continue to the next phase of ceblet execution the user would click on the Next button.
- Figure 6.8.11 The last phase of ceblet execution prompts the user to annotate any results generated. In this case, an alignment has been created, and a default name is presented to the user for editing. When the user has supplied a name, and, optionally, a description, clicking the Finish button will result in the Alignment object being added to the object store for later use.
- Figure 6.8.12 The Serial Coalescent Simulator ceblet is depicted in this figure. With the options as shown, the result will be one tree with 15 taxonomic units.
- Figure 6.8.13 The Alignment Simulator ceblet allows the user to simulate an alignment across one or more input trees. In this particular use of this ceblet, the root sequence has been selected as TAGCAT.
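The kind of simulation described for the Alignment Simulator ceblet can be sketched in a few lines of Python: a root sequence is evolved down each branch of a tree under the JC69 model, and the sequences at the tips form the simulated alignment. This is an illustrative sketch only, not PEBBLE's implementation. The `(name, branch_length, children)` tuple layout, the function names, and the example tree are assumptions introduced here; branch lengths are taken to be in expected substitutions per site.

```python
import math
import random

BASES = "ACGT"


def jc69_prob_same(branch_length):
    """JC69 probability that a site shows the same base after a branch."""
    return 0.25 + 0.75 * math.exp(-4.0 * branch_length / 3.0)


def evolve(sequence, branch_length, rng):
    """Evolve a sequence along one branch under JC69."""
    p_same = jc69_prob_same(branch_length)
    out = []
    for base in sequence:
        if rng.random() < p_same:
            out.append(base)
        else:
            # Under JC69 the three other bases are equally likely.
            out.append(rng.choice([b for b in BASES if b != base]))
    return "".join(out)


def simulate_alignment(node, parent_sequence, rng, alignment=None):
    """Recursively simulate sequences; return {tip name: sequence}.

    node is a hypothetical (name, branch_length, children) tuple.
    """
    if alignment is None:
        alignment = {}
    name, branch_length, children = node
    sequence = evolve(parent_sequence, branch_length, rng)
    if not children:
        alignment[name] = sequence
    for child in children:
        simulate_alignment(child, sequence, rng, alignment)
    return alignment


# Example tree with serially sampled tips: "late" sits further from the
# root than "early_a" and "early_b", as if it were sampled at a later time.
example_tree = ("root", 0.0, [
    ("early_a", 0.05, []),
    ("inner", 0.02, [
        ("early_b", 0.03, []),
        ("late", 0.10, []),
    ]),
])

simulated = simulate_alignment(example_tree, "TAGCAT", random.Random(42))
```

Passing an explicit `random.Random` instance keeps a run reproducible, which is convenient when simulated alignments feed a parametric bootstrap.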
Literature Cited
Barnes, I., Matheus, P., Shapiro, B., Jensen, D., and Cooper, A. 2002. Dynamics of Pleistocene population extinctions in Beringian brown bears. Science 295:2267‐2270.
Drummond, A. and Rodrigo, A.G. 2000. Reconstructing genealogies of serial samples under the assumption of a molecular clock using serial‐sample UPGMA (sUPGMA). Mol. Biol. Evol. 17:1807‐1815.
Drummond, A. and Strimmer, K. 2001. PAL: An object‐oriented programming library for molecular evolution and phylogenetics. Bioinformatics 17:662‐663.
Drummond, A., Forsberg, R., and Rodrigo, A.G. 2001. Estimating stepwise changes in substitution rates using serial samples. Mol. Biol. Evol. 18:1365‐1371.
Drummond, A., Nicholls, G.K., Rodrigo, A.G., and Solomon, W. 2002. Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics 161:1307‐1320.
Drummond, A.J., Pybus, O.G., Rambaut, A., Forsberg, R., and Rodrigo, A.G. 2003. Measurably evolving populations. Trends Ecol. Evol. 18:481‐488.
Felsenstein, J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 17:368‐376.
Forsberg, R., Oleksiewicz, M.B., Petersen, A.M.K., Hein, J., Botner, A., and Storgaard, T. 2001. A molecular clock dates the common ancestor of European‐type porcine reproductive and respiratory syndrome virus at more than 10 years before the emergence of disease. Virology 289:174‐179.
Fu, Y.X. 2001. Estimating mutation rate and generation time from longitudinal samples of DNA sequences. Mol. Biol. Evol. 18:620‐626.
Goldman, N. 1990. Maximum likelihood inferences of phylogenetic trees, with special reference to the Poisson process model of DNA substitutions and to parsimony analysis. Syst. Zool. 39:345‐361.
Huelsenbeck, J.P., Hillis, D.M., and Jones, R. 1996. Parametric bootstrapping in molecular phylogenetics: Applications and performance. In Molecular Zoology: Advances, Strategies and Protocols (J. D. Ferraris, and S. R. Palumbi, eds.) pp. 19‐45. John Wiley & Sons, New York.
Jukes, T. and Cantor, C. 1969. Evolution of protein molecules. In Mammalian Protein Metabolism, Volume III (H. Munro, ed.) pp. 21‐132. Academic Press, New York.
Kalbfleisch, J.G. 1985. Probability and Statistical Inference. Springer‐Verlag, New York.
Kingman, J.F.C. 1982. The coalescent. Stochastic Process. Appl. 13:235‐248.
Lambert, D.M., Ritchie, P.A., Millar, C.D., Holland, B., Drummond, A.J., and Baroni, C. 2002. Rates of evolution in ancient DNA from Adelie penguins. Science 295:2270‐2273.
Leitner, T. and Albert, J. 1999. The molecular clock of HIV‐1 unveiled through analysis of a known transmission history. Proc. Natl. Acad. Sci. U.S.A. 96:10752‐10757.
Leonard, J.A., Wayne, R.K., and Cooper, A. 2000. Population genetics of Ice Age brown bears. Proc. Natl. Acad. Sci. U.S.A. 97:1651‐1654.
Rambaut, A. 2000. Estimating the rate of molecular evolution: Incorporating non‐contemporaneous sequences into maximum likelihood phylogenies. Bioinformatics 16:395‐399.
Rambaut, A. and Bromham, L. 1998. Estimating divergence rates from molecular sequences. Mol. Biol. Evol. 15:442‐448.
Rodrigo, A.G. and Felsenstein, J. 1999. Coalescent approaches to HIV population genetics. In The Evolution of HIV (K.A. Crandall, ed.) pp. 223‐271. Johns Hopkins University Press, Baltimore.
Rodrigo, A.G., Shpaer, E.G., Delwart, E.L., Iverson, A.K.N., Gallo, M.V., Brojatsch, J., Hirsch, M.S., Walker, B.D., and Mullins, J.I. 1999. Coalescent estimates of HIV‐1 generation time in vivo. Proc. Natl. Acad. Sci. U.S.A. 96:2187‐2191.
Rodriguez, F., Oliver, J.F., Marin, A., and Medina, J.R. 1990. The general stochastic model of nucleotide substitution. J. Theor. Biol. 142:485‐501.
Shankarappa, R., Margolick, J.B., Gange, S.J., Rodrigo, A.G., Upchurch, D., Farzadegan, H., Gupta, P., Rinaldo, C.R., Learn, G.H., He, X., Huang, X.L., and Mullins, J.I. 1999. Consistent viral evolutionary dynamics associated with the progression of HIV‐1 infection. J. Virol. 73:10489‐10502.
Sneath, P.H.A. and Sokal, R.R. 1973. Numerical Taxonomy. W.H. Freeman, San Francisco.
Steel, M. and McKenzie, A. 2000. Properties of phylogenetic trees generated by Yule‐type speciation models. Math. Biosci. 170:91‐112.
Yang, Z. 1994. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methods. J. Mol. Evol. 39:306‐314.
Internet Resources
http://www.cebl.auckland.ac.nz
The PEBBLE application can be downloaded from the above URL by following the Software link. To be notified of updates and milestone releases of the PEBBLE application, send an e‐mail to pebble_notice-subscribe@yahoogroups.com. Bug reports regarding the PEBBLE application may be sent to pebble-bugs@yahoogroups.com.
http://www.cebl.auckland.ac.nz/pal-project
The home page of the PAL project can be found at the above URL.