Using OrthoCluster for the Detection of Synteny Blocks Among Multiple Genomes

Internet, 2013-12-31

Abstract

Synteny blocks are composed of two or more orthologous genes conserved among species, resulting from speciation from their last common ancestor. OrthoCluster (Zeng et al., 2008) is a fast and easy‐to‐use program for the identification of synteny blocks among multiple genomes. It allows users to identify synteny blocks that contain different types of mismatches, and to decide whether to require conservation of gene orientation and of gene order within the blocks. OrthoCluster can also be used to find duplicated blocks within genomes. Although genes and their correspondences are the usual input, OrthoCluster can be applied to any type of marker as long as the relationships among the markers can be established. OrthoClusterDB provides a Web interface for running OrthoCluster with user‐defined datasets and parameters, as well as for browsing and downloading precomputed synteny blocks for different groups of genomes. Curr. Protoc. Bioinform. 27:6.10.1‐6.10.18. © 2009 by John Wiley & Sons, Inc.

Keywords: OrthoCluster; OrthoClusterDB; InParanoid; MultiParanoid; genome painter; GBrowse; segmental duplication
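
To make the notion of a perfect synteny block described in the abstract more concrete, the following is a minimal Python sketch and not OrthoCluster's actual algorithm. It assumes two single‐chromosome genomes given as ordered lists of (gene, strand) pairs and a one‐to‐one ortholog mapping; the function name perfect_blocks, the gene names, and the data layout are all hypothetical. A block here is a run of adjacent genes in the first genome whose orthologs are also adjacent in the second genome, either in the same order on matching strands or as an inverted run on opposite strands, with no mismatches allowed.

```python
from typing import Dict, List, Tuple

Gene = Tuple[str, str]  # (gene name, strand "+" or "-"); hypothetical layout


def perfect_blocks(
    genome_a: List[Gene],
    genome_b: List[Gene],
    orthologs: Dict[str, str],
    min_size: int = 2,
) -> List[List[Tuple[str, str]]]:
    """Toy detector of mismatch-free blocks; not OrthoCluster's algorithm."""
    # Position and strand of every gene in genome B.
    index_b = {name: (i, strand) for i, (name, strand) in enumerate(genome_b)}
    blocks: List[List[Tuple[str, str]]] = []
    current: List[Tuple[str, str]] = []  # ortholog pairs in the open run
    prev = None  # (position in B, same-strand flag) of the last gene in the run

    def flush() -> None:
        # Keep the open run only if it reaches the minimum block size.
        if len(current) >= min_size:
            blocks.append(list(current))
        current.clear()

    for name_a, strand_a in genome_a:
        name_b = orthologs.get(name_a)
        if name_b is None or name_b not in index_b:
            # A gene without an ortholog would be a mismatch, so the run ends.
            flush()
            prev = None
            continue
        pos, strand_b = index_b[name_b]
        same = strand_a == strand_b
        # Forward runs advance by +1 in B; inverted runs go back by 1.
        step = 1 if same else -1
        extends = prev is not None and prev[1] == same and pos - prev[0] == step
        if not extends:
            flush()
        current.append((name_a, name_b))
        prev = (pos, same)
    flush()
    return blocks


# Made-up example data: one forward block and one inverted block.
genome_a = [("a1", "+"), ("a2", "+"), ("a3", "-"),
            ("a4", "+"), ("a5", "+"), ("a6", "-")]
genome_b = [("b1", "+"), ("b2", "+"), ("b3", "-"),
            ("b6", "+"), ("b5", "-"), ("bx", "+")]
orthologs = {"a1": "b1", "a2": "b2", "a3": "b3", "a5": "b5", "a6": "b6"}

blocks = perfect_blocks(genome_a, genome_b, orthologs)
for block in blocks:
    print(block)
```

The sketch deliberately ignores everything that makes the real problem hard: multiple chromosomes, more than two genomes, one‐to‐many correspondences, and the different mismatch types that OrthoCluster lets users tolerate.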

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Introduction
Files
Basic Protocol 1: Using OrthoCluster via Web Interface with User‐Defined Datasets and Parameters
Basic Protocol 2: Using OrthoCluster via Web Interface with Predefined Datasets and Parameters
Basic Protocol 3: Using OrthoCluster Locally for the Detection of Synteny Blocks
Alternate Protocol 1: Using OrthoCluster for the Detection of Segmental Duplications
Commentary
Literature Cited
Figures
Tables

Materials

Figures

Figure 6.10.1 Part of the genome file for C. elegans (genome_ele.txt).

Figure 6.10.2 Part of the correspondence file between C. elegans and C. briggsae (mapping_ele_bri.txt).

Figure 6.10.3 User‐modifiable parameters available in the Run OrthoCluster page of OrthoClusterDB. Parameters can be divided into those that handle order and strandedness, synteny block size, mismatches, and the output of blocks generated by OrthoCluster. The default values of the parameters are for detecting perfect synteny blocks.

Figure 6.10.4 Genome Painter image output after running OrthoCluster from the Run OrthoCluster page. The main content of this page is a summary table with information about the OrthoCluster run, and the Genome Painter image displaying the location of synteny blocks in the target genome(s) with respect to the reference genome. Additional information describing the steps done by OrthoCluster to generate the results is also available. In this example, the information displayed corresponds to the result of running OrthoCluster between C. elegans and C. briggsae for the detection of perfectly conserved synteny blocks.

Figure 6.10.5 GBrowse view of a synteny block. The region displayed corresponds to the largest perfect synteny block found between the genomes of C. elegans and C. briggsae.

Figure 6.10.6 View Synteny page on the OrthoClusterDB Web site. Five groups of species are available for visualization of precomputed synteny blocks. In this example, the group selected is Caenorhabditis, the reference genome is C. elegans, and the target genome is C. briggsae. Perfect and imperfect precomputed synteny blocks are available (see text). For this example, visualization of perfect synteny blocks is selected.

Figure 6.10.7 Structure of the common header of OrthoCluster files.

Figure 6.10.8 Example of this header for a comparison of two genomes that finds nonoverlapping blocks (‐f) with a minimum block size of 1 (‐l 1) and a maximum block size of 1000 (‐u 1000), with no mismatches, and preserving the four different types of order (‐r) and strandedness (‐s).

Figure 6.10.9 Format of the first line of each block (.cluster file) given N genomes.

Figure 6.10.10 Format of the line for each gene in a block.

Figure 6.10.11 A synteny block generated by running the standalone OrthoCluster program.

Literature Cited
	Alexeyenko, A., Tamas, I., Liu, G., and Sonnhammer, E.L. 2006. Automatic clustering of orthologs and inparalogs shared by multiple proteomes. Bioinformatics 22:e9‐e15.
	Flicek, P., Aken, B.L., Beal, K., Ballester, B., Caccamo, M., Chen, Y., Clarke, L., Coates, G., Cunningham, F., Cutts, T., Down, T., Dyer, S.C., Eyre, T., Fitzgerald, S., Fernandez‐Banet, J., Gräf, S., Haider, S., Hammond, M., Holland, R., Howe, K.L., Howe, K., Johnson, N., Jenkinson, A., Kähäri, A., Keefe, D., Kokocinski, F., Kulesha, E., Lawson, D., Longden, I., Megy, K., Meidl, P., Overduin, B., Parker, A., Pritchard, B., Prlic, A., Rice, S., Rios, D., Schuster, M., Sealy, I., Slater, G., Smedley, D., Spudich, G., Trevanion, S., Vilella, A.J., Vogel, J., White, S., Wood, M., Birney, E., Cox, T., Curwen, V., Durbin, R., Fernandez‐Suarez, X.M., Herrero, J., Hubbard, T.J., Kasprzyk, A., Proctor, G., Smith, J., Ureta‐Vidal, A., and Searle, S. 2008. Ensembl 2008. Nucleic Acids Res. 36:D707‐D714.
	Housworth, E.A. and Postlethwait, J. 2002. Measures of synteny conservation between species pairs. Genetics 162:441‐448.
	Li, L., Stoeckert, C.J. Jr., and Roos, D.S. 2003. OrthoMCL: Identification of ortholog groups for eukaryotic genomes. Genome Res. 13:2178‐2189.
	Ng, M.P., Vergara, I.A., Frech, C., Chen, Q., Zeng, X., Pei, J., and Chen, N. 2009. OrthoClusterDB: An online platform for synteny blocks. BMC Bioinform. 10:192.
	O'Brien, K.P., Remm, M., and Sonnhammer, E.L. 2005. Inparanoid: A comprehensive database of eukaryotic orthologs. Nucleic Acids Res. 33:D476‐D480.
	Remm, M., Storm, C.E., and Sonnhammer, E.L. 2001. Automatic clustering of orthologs and in‐paralogs from pairwise species comparisons. J. Mol. Biol. 314:1041‐1052.
	Stein, L.D., Mungall, C., Shu, S., Caudy, M., Mangone, M., Day, A., Nickerson, E., Stajich, J.E., Harris, T.W., Arva, A., and Lewis, S. 2002. The generic genome browser: A building block for a model organism system database. Genome Res. 12:1599‐1610.
	Vergara, I.A., Mah, A.K., Huang, J.C., Tarailo‐Graovac, M., Johnsen, R.C., Baillie, D.L., and Chen, N. 2009. Polymorphic segmental duplication in the nematode Caenorhabditis elegans. BMC Genomics 10:329.
	Zeng, X., Pei, J., Vergara, I.A., Nesbitt, M., Wang, K., and Chen, N. 2008. OrthoCluster: A new tool for mining synteny blocks and applications in comparative genomics. In 11th International Conference on Extending Database Technology (EDBT), March 25‐30, 2008, Nantes, France. Association for Computing Machinery, New York.