Using The Arabidopsis Information Resource (TAIR) to Find Information About Arabidopsis Genes

Internet, 2013-12-31

Abstract

The Arabidopsis Information Resource (TAIR; http://arabidopsis.org) is a comprehensive Web resource of Arabidopsis biology for plant scientists. TAIR curates and integrates information about genes, proteins, gene function, gene expression, mutant phenotypes, biological materials such as clones and seed stocks, genetic markers, genetic and physical maps, biochemical pathways, genome organization, images of mutant plants, protein subcellular localizations, publications, and the research community. The various data types are extensively interconnected and can be accessed through a variety of Web‐based search and display tools. This unit focuses primarily on basic methods for searching, browsing, visualizing, and analyzing information about Arabidopsis genes, and describes several new tools, such as the new TAIR genome browser (GBrowse) and the TAIR synteny viewer (GBrowse_syn). We also describe how to use AraCyc for mining plant metabolic pathways. Curr. Protoc. Bioinform. 30:1.11.1‐1.11.51. © 2010 by John Wiley & Sons, Inc.

Keywords: Arabidopsis; databases; bioinformatics; data mining; genomics

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Table of Contents

Introduction
Basic Protocol 1: TAIR Homepage, Sitemap, and Navigation
Basic Protocol 2: Finding Comprehensive Information about Arabidopsis Genes
Basic Protocol 3: Using the Arabidopsis Genome Browsers (SeqViewer and GBrowse)
Basic Protocol 4: Using the Gene Ontology Annotations: Finding Genes with Similar Functions
Basic Protocol 5: Finding and Ordering Mutant Seeds and cDNA Clones from the Stock Center
Basic Protocol 6: Using Public Microarray Data in TAIR
Support Protocol 1: Mapping Array Elements to Annotated Loci
Basic Protocol 7: Using the Motif Analysis Tool for Identifying Potential cis‐Regulatory Motifs in Upstream Sequences
Basic Protocol 8: Obtaining Information about Arabidopsis Metabolic Pathways
Basic Protocol 9: Displaying Metabolic Data Using the “OMICS” Viewer
Guidelines for Understanding Results
Commentary
Literature Cited
Figures

Figures

Figure 1.11.1 TAIR's home page (http://arabidopsis.org) is the main entry point to the database and Web site.

Figure 1.11.2 A sample locus page from TAIR showing the major data included in the detail page. A portion of the germplasm section has been deleted for simplicity. Each of the data types displayed in the alternating colored bands can be grouped into one or more of the following categories: (A) (a) general descriptive locus information, (b) gene model information, (c) functional annotations, (d) gene expression data; (B) (e) nucleotide sequences, (f) protein data, (g) mapping data, (h) polymorphisms and alleles, (i) germplasm information; (C) (j) links to resources outside of TAIR, (k) comments about the locus, and (l) papers and abstracts.

Figure 1.11.3 SeqViewer home page after submitting the gene name AT1G07780 as a query term. The five nuclear chromosomes are shown as green lines with blue boxes indicating the location of the centromeres (a); a few markers are included as landmarks for orientation. Queries can be typed, pasted, or uploaded into the text input box (b). The available options include searches by name or sequence. The number of matches is displayed above the chromosomes (4 in this example) and is hyperlinked to a list of results. Each match to the genome is indicated with a red tick mark on the chromosomes; clicking on the mark opens a detailed Close‐up view. The Close‐up View options (c) are used to select the zoom level and the types of objects to display in the detailed view.
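
Before pasting or uploading a list of gene names into a search box like the one in this figure, it can help to check that each name looks like a locus identifier. The sketch below is illustrative only and is not part of TAIR. It assumes the usual AGI locus format ("AT", a chromosome code of 1 to 5, C, or M, then "G" and five digits), and the function names are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Assumed AGI locus format: "AT" + chromosome code (1-5, C, or M) + "G" + five digits.
AGI_LOCUS = re.compile(r"^AT[1-5CM]G\d{5}$")


def normalize_locus(name: str) -> Optional[str]:
    """Return the trimmed, upper-cased identifier, or None if it does not match."""
    candidate = name.strip().upper()
    return candidate if AGI_LOCUS.match(candidate) else None


def split_queries(text: str) -> Tuple[List[str], List[str]]:
    """Split whitespace- or comma-separated names into (valid loci, rejected names)."""
    valid: List[str] = []
    rejected: List[str] = []
    for token in re.split(r"[\s,]+", text.strip()):
        if not token:
            continue
        locus = normalize_locus(token)
        if locus is not None:
            valid.append(locus)
        else:
            rejected.append(token)
    return valid, rejected


if __name__ == "__main__":
    # AT1G07780 is the query used in the figure; the other tokens are made up.
    print(split_queries("at1g07780, AT1G0778 notalocus"))
```

Names that fail the check (gene symbols, for example) may still be valid search terms; the check only separates identifiers in the assumed locus format from everything else.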

Figure 1.11.4 A 10‐kb region of chromosome 1 centered on the AT1G07780 locus, which is highlighted in yellow. (a) The area of the genome shown in the Close‐up view is indicated by the numbered box in the whole genome view. (b) The radio button for selecting three or all rows of data to display in the Close‐up view. (c) The gray re‐centering bar. (d) The gray bar between the T‐DNA and gene bands is used for selecting a 10‐kb region to display in the nucleotide sequence view.
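
The idea of a fixed-width region centered on a locus can be written out as simple coordinate arithmetic. The following is a minimal sketch, not SeqViewer's own logic: it assumes 1-based inclusive coordinates, and the function name, the example coordinates, and the chromosome length are hypothetical.

```python
from typing import Optional, Tuple


def centered_window(start: int, end: int, width: int = 10_000,
                    chrom_length: Optional[int] = None) -> Tuple[int, int]:
    """Return a (start, end) window of `width` bases centered on a feature.

    Coordinates are assumed to be 1-based and inclusive. The window is
    shifted, not shrunk, when it would run past either chromosome end.
    """
    if start > end:
        start, end = end, start  # tolerate features given in reverse order
    midpoint = (start + end) // 2
    win_start = max(1, midpoint - width // 2)
    win_end = win_start + width - 1
    if chrom_length is not None and win_end > chrom_length:
        win_end = chrom_length
        win_start = max(1, win_end - width + 1)
    return win_start, win_end


if __name__ == "__main__":
    # Hypothetical feature coordinates, not the real position of any locus.
    print(centered_window(20_000, 22_000))
```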

Figure 1.11.5 A nucleotide sequence view centered on AT1G07810 showing annotated genes and T‐DNA/transposon insertion flanking sequences. The drop‐down menu (shown in upper right corner) was used to select the items to display in the nucleotide sequence view.

Figure 1.11.6 Overview of the GBrowse tool. (A) The upper panel of this tool allows the user to search for landmarks in the genome, such as gene names or chromosome positions, and to zoom in and out of a specific genomic region. The central panel displays a series of data tracks such as genes, cDNAs, polymorphisms, the VISTA sequence similarity track, and many more. (B) Using the track menu at the bottom, the user can choose which tracks to display by selecting either whole data categories or specific types of data.

Figure 1.11.7 (A) Keyword search results after querying for the GO Molecular Function terms containing the word farnesyltransferase. (B) A tree view of the term “protein farnesyltransferase activity” and associated gene annotations.

Figure 1.11.8 (A) Results display for functional categorization of WRKY genes. The members of this family fall into 22 different GO Slim categories based on their annotations to more granular GO terms. The list can be re‐sorted by choosing Frequency from the “re‐sort by:” drop‐down menu and clicking on the “re‐sort” button. The list of 22 categories is shown grouped by keyword category. The frequency of annotations to each category is listed in the last column; the number is linked to a list of genes annotated to the terms that are children of that category. (B) Clicking on the “create pie charts” button generates pie charts showing the distribution and frequency of annotations to each of the GO Slim terms. A different pie chart is created for each aspect of the GO ontologies.
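
The tally behind such a display (annotation counts per GO Slim category, kept separate for each ontology aspect and re-sortable by frequency) can be sketched in a few lines. This is an illustration under stated assumptions, not TAIR's implementation: the gene names, aspect labels, and category assignments below are made-up placeholders, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical (gene, aspect, GO Slim category) rows; not real TAIR annotations.
EXAMPLE_ROWS = [
    ("gene_a", "function", "transcription factor activity"),
    ("gene_b", "function", "transcription factor activity"),
    ("gene_b", "function", "DNA or RNA binding"),
    ("gene_a", "component", "nucleus"),
    ("gene_b", "component", "nucleus"),
    ("gene_c", "process", "response to stress"),
]


def tally_by_aspect(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, Counter]:
    """Count annotations per category, kept separate for each aspect."""
    counts: Dict[str, Counter] = {}
    for _gene, aspect, category in rows:
        counts.setdefault(aspect, Counter())[category] += 1
    return counts


def sort_by_frequency(counter: Counter) -> List[Tuple[str, int]]:
    """Most frequent category first; ties are broken alphabetically."""
    return sorted(counter.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for aspect, counter in tally_by_aspect(EXAMPLE_ROWS).items():
        print(aspect, sort_by_frequency(counter))
```

Each per-aspect counter corresponds to the data behind one pie chart; the counts are annotations, so a gene annotated to two categories contributes to both.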

Figure 1.11.9 Subsections of a detail page for a microarray experiment. The tabs are used to navigate to different subsets of information about the experiment. (A) The experiment summary page, which is linked from the experiment search results. (B) The Slides & Datasets subsection for the experiment. The download button (arrow) links to an automatic download of tables containing the experiment summary along with raw and normalized values for each slide in the set.

Figure 1.11.10 Web interface for searching gene expression data from microarray experiments in TAIR.

Figure 1.11.11 Sample result set for the Microarray Expression search using AT1G55020 locus. The drop‐down menu (arrow) and resort button are used to change the order of the results display.

Figure 1.11.12 Motif Finder tool. (A) Users can type in or upload a list of genes and select the promoter length to be analyzed. (B) The resulting motifs are listed with the corresponding genes in which they are overrepresented.
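
To make the idea of listing motifs alongside the genes that carry them concrete, the sketch below collects short oligomers shared by the upstream sequences of several genes. It is a simplified illustration, not the algorithm used by the TAIR tool: it only reports which k-mers occur in at least a minimum number of sequences and does not test for statistical overrepresentation against a background set. The sequences, gene names, word length, and function names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Toy upstream sequences for made-up genes (not real promoter data).
TOY_UPSTREAM = {
    "gene_a": "AATGTCTCAA",
    "gene_b": "CCTGTCTCGG",
    "gene_c": "ACGTACGTAC",
}


def kmers(seq: str, k: int = 6) -> Set[str]:
    """Return the set of distinct k-mers in a sequence (one strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(upstream: Dict[str, str], k: int = 6,
                 min_genes: int = 2) -> Dict[str, List[str]]:
    """Map each k-mer found in at least `min_genes` sequences to those genes."""
    genes_by_kmer: Dict[str, Set[str]] = defaultdict(set)
    for gene, seq in upstream.items():
        for kmer in kmers(seq, k):
            genes_by_kmer[kmer].add(gene)
    return {
        kmer: sorted(genes)
        for kmer, genes in genes_by_kmer.items()
        if len(genes) >= min_genes
    }


if __name__ == "__main__":
    print(shared_kmers(TOY_UPSTREAM))
```

A real analysis would also scan the reverse complement and compare each motif's frequency in the query set with its frequency in a genome-wide background before calling it overrepresented.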

Figure 1.11.13 AraCyc pathway details for galactose degradation I. A pathway schematic shows the reactions (blue), compounds (red), enzymes (yellow), and genes (purple). At the highest zoom level (inset box) compound structures are displayed. Each enzyme linked to a reaction based on experimental evidence is presented in bold‐face type. A pathway evidence icon appears in the upper right corner of the diagram. Additional links, a pathway summary, and references are available below the pathway schematic.

Figure 1.11.14 AraCyc OMICS Viewer Pathway Overview. Step 11 of the corresponding protocol provides details about (a) the overview diagram, (b) the color key, (c) the basic data statistics (not shown), and (d) the histogram. Pop‐up windows displaying more detailed information and links to AraCyc pathway and compound pages can be opened by clicking on any compound shown on the overview diagram. A pop‐up window of the ethylene biosynthesis pathway showing data values for specific isozymes is superimposed on the overview diagram.

Figure 1.11.15 Four buttons control the display of an OMICS viewer animation. The buttons can be used to stop and restart the animation, and step through the individual “slides.”
