Searching for Non‐B DNA‐Forming Motifs Using nBMST (Non‐B DNA Motif Search Tool)
Abstract
This unit describes basic protocols for using the non‐B DNA Motif Search Tool (nBMST) to search for sequence motifs predicted to form alternative DNA conformations that differ from the canonical right‐handed Watson‐Crick double helix, collectively known as non‐B DNA, and for using the associated PolyBrowse, a GBrowse‐based genomic browser. The nBMST is a Web‐based resource that allows users to submit one or more DNA sequences to search for inverted repeats (cruciform DNA), mirror repeats (triplex DNA), direct/tandem repeats (slipped/hairpin structures), G4 motifs (tetraplex, G‐quadruplex DNA), alternating purine‐pyrimidine tracts (left‐handed Z‐DNA), and A‐phased repeats (static bending). The nBMST is versatile, simple to use, does not require bioinformatics skills, and can be applied to any type of DNA sequence, including viral and bacterial genomes, up to an aggregate of 20 megabasepairs (Mbp). Curr. Protoc. Hum. Genet. 73:18.7.1‐18.7.22. © 2012 by John Wiley & Sons, Inc.
Keywords: nBMST; non‐B DNA; nucleotide sequence analysis; G‐quadruplex; triplex; cruciform; Z‐DNA; hairpin; slipped DNA; alternative DNA structure; tandem repeats; PolyBrowse
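As a rough illustration of the kind of motif search summarized in the abstract, the sketch below parses FASTA text, checks the aggregate length against the 20‐Mbp limit, and scans for G4 motifs. It is not the nBMST implementation: the regular expression (four runs of at least three G separated by loops of 1 to 7 nt) is a commonly used approximation assumed here for illustration, the nBMST's actual criteria are those shown in Figure 18.7.1, and the function names `parse_fasta`, `within_size_limit`, and `find_g4` are hypothetical.

```python
import re
from typing import Dict, Iterator, List, Tuple

MAX_AGGREGATE_BP = 20_000_000  # 20 Mbp aggregate limit stated in the abstract

# ASSUMPTION: a widely used textbook G4 pattern, not the nBMST criteria.
# Four runs of >=3 G, separated by three loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {record id: uppercase sequence}."""
    records: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Use the first header token as the id; fall back to a counter.
            current = header[0] if header else f"record_{len(records) + 1}"
            records[current] = []
        elif current is not None:
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def within_size_limit(records: Dict[str, str],
                      limit: int = MAX_AGGREGATE_BP) -> bool:
    """Return True if the summed length of all sequences is within the limit."""
    return sum(len(seq) for seq in records.values()) <= limit


def find_g4(seq: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, motif) for non-overlapping matches on the given strand.

    Coordinates are 1-based and inclusive. Only the strand passed in is
    scanned; C-rich matches on the opposite strand are not reported.
    """
    for match in G4_PATTERN.finditer(seq):
        yield match.start() + 1, match.end(), match.group()


if __name__ == "__main__":
    demo = ">seq1 demo\nttgggagggt\nGGGAGGGTT\n"
    recs = parse_fasta(demo)
    if within_size_limit(recs):
        for name, seq in recs.items():
            for start, end, motif in find_g4(seq):
                print(name, start, end, motif)
```

A local check of this kind can be useful for confirming that a set of sequences fits under the aggregate limit before submission; the motif calls themselves should be taken from the nBMST output rather than from this approximation.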
Table of Contents
- Introduction
- Basic Protocol 1: Using the nBMST Server
- Basic Protocol 2: Using the PolyBrowse Viewer
- Commentary
- Literature Cited
- Figures
- Tables
Figures
- Figure 18.7.1 Non‐B DNA‐forming motif search criteria.
- Figure 18.7.2 The nBMST submission page. The five steps involved in the submission process are shown. An e‐mail address is entered and all the non‐B DNA motifs are selected (grayed area). A FASTA sequence, in this example NC_007530.2 Bacillus anthracis str. ‘Ames Ancestor’, is uploaded. The captcha characters (4AkA4 in this instance) are entered because the user did not log in.
- Figure 18.7.3 The nBMST results page. The upper section includes the major statistics of the nBMST run and the lower section displays expandable individual results for each motif type.
- Figure 18.7.4 Static visualization of the direct repeats as a PNG image. The PNG image may be saved either by right‐clicking on it or by clicking on “Download all files for this motif” on the upper right.
- Figure 18.7.5 Dynamic visualization of the direct repeats on the PolyBrowse page. The PolyBrowse page is created uniquely for each nBMST job submitted and is visible only to the user who submitted the job. In this example, job ID 8c32923d93, a 5‐Mbp region of the Bacillus anthracis results, is displayed.
- Figure 18.7.6 A smaller region, 6.001 kbp, of Figure is zoomed in, and all the non‐B DNA motif tracks are turned on. The ‐/+ sign (left of the motif tracks) may be used to toggle between hiding and showing the results.
- Figure 18.7.7 Dynamic visualization of multiple FASTA files. When multiple FASTA sequences are submitted, PolyBrowse displays multiple sub‐links (red box), each representing a specific sequence. Clicking on a sub‐link displays the non‐B DNA motifs found for that specific sequence.
- Figure 18.7.8 PolyBrowse general page layout displaying the File and Help options, the Landmark or Region field where coordinates around the human c‐Myc gene are entered, and Human v37 selected as Data Source. Also shown are Gene tracks including Refseq Genes, mRNAs, and Promoters; the Non‐B DNA motif track G‐Quadruplex Forming Repeat; and the Polymorphism track dbSNP. Note that of the three G‐quadruplex repeat motifs (blue glyphs), two are predicted within the promoter, MYC_Prom (orange glyph), and one within the first exon of the mRNA, NM_002467.3 (gray glyph), of the MYC gene. One dbSNP entry, rs13250910 (red glyph), falls within one G‐quadruplex motif.
- Figure 18.7.9 PolyBrowse page showing the Non‐B DNA motifs section, where the annotation track for G‐Quadruplex Forming Repeat is turned on. Another track turned on is 1k LiftOver Blocks from the Synteny section.
- Figure 18.7.10 PolyBrowse page showing “trace GPlexes clusters” tracks, where trace refers to Sanger trace reads that have been mapped against the human genome reference (version 37.1).
- Figure 18.7.11 PolyBrowse. 1k LiftOver blocks available for cross‐species comparison. Note that when the cursor hovers over chr8_128748001_LO1k, the syntenic regions for other species are displayed. Clicking on a new species takes the user to the selected species. In the example in Figure , we clicked on Go to Chimp.
- Figure 18.7.12 PolyBrowse page showing the MYC gene with Chimp v2 as Data Source. Note that all three G‐quadruplex forming repeat motifs are conserved in the chimpanzee genome sequence. As in human, of the three G‐quadruplex repeat motifs (blue glyphs), two are predicted within the promoter, MYC_Prom (orange glyph), and one within the first exon of the mRNA, XM_519958.2 (gray glyph), of the MYC gene.
- Figure 18.7.13 The 16 and 17 bp G4 DNA‐forming structures reported in Cahoon and Seifert (2009) are captured by nBMST. (A) Details of the G4 motifs in the Neisseria gonorrhoeae genome NCCP11945. (B) Details of the G4 motifs in the Neisseria gonorrhoeae genome FA 1090. (C) The genomic region of NCCP11945 containing both the 16 and 17 bp G4 DNA‐forming repeats as seen in PolyBrowse. (D) The details of the 17 bp G4 DNA‐forming repeat are obtained by clicking on the green track in (C).
- Figure 18.7.14 Non‐B DNA‐forming motifs detected by nBMST in the study by Lawson et al. (2011). (A) Nucleotide sequence of the 67‐bp de novo insertion observed in patient PA27. (B) PolyBrowse view of the nBMST results for the 220‐bp sequence from patient PA27 containing the 67‐bp de novo insertion, which spans nt 62 to 128. Most of the motifs occurred near or within the 67‐bp de novo insertion.
Literature Cited
Adachi, M. and Tsujimoto, Y. 1990. Potential Z‐DNA elements surround the breakpoints of chromosome translocation within the 5′ flanking region of bcl‐2 gene. Oncogene 5:1653‐1657.
Akagi, K., Li, J., Stephens, R.M., Volfovsky, N., and Symer, D.E. 2008. Extensive variation between inbred mouse strains due to endogenous L1 retrotransposition. Genome Res. 18:869‐880.
Akagi, K., Stephens, R.M., Li, J., Evdokimov, E., Kuehn, M.R., Volfovsky, N., and Symer, D.E. 2009. MouseIndelDB: A database integrating genomic indel polymorphisms that distinguish mouse strains. Nucleic Acids Res. 38:D600‐D606.
Bacolla, A. and Wells, R.D. 2004. Non‐B DNA conformations, genomic rearrangements, and human disease. J. Biol. Chem. 279:47411‐47414.
Bacolla, A. and Wells, R.D. 2009. Non‐B DNA conformations as determinants of mutagenesis and human disease. Mol. Carcinog. 48:273‐285.
Boehm, T., Mengle‐Gaw, L., Kees, U.R., Spurr, N., Lavenir, I., Forster, A., and Rabbitts, T.H. 1989. Alternating purine‐pyrimidine tracts may promote chromosomal translocations seen in a variety of human lymphoid tumours. EMBO J. 8:2621‐2631.
Brouwer, J.R., Willemsen, R., and Oostra, B.A. 2009. Microsatellite repeat instability and neurological disease. Bioessays 31:71‐83.
Cahoon, L.A. and Seifert, H.S. 2009. An alternative DNA structure is necessary for pilin antigenic variation in Neisseria gonorrhoeae. Science 325:764‐767.
Carvalho, C.M., Ramocki, M.B., Pehlivan, D., Franco, L.M., Gonzaga‐Jauregui, C., Fang, P., McCall, A., Pivnick, E.K., Hines‐Dowell, S., Seaver, L.H., Friehling, L., Lee, S., Smith, R., Del Gaudio, D., Withers, M., Liu, P., Cheung, S.W., Belmont, J.W., Zoghbi, H.Y., Hastings, P.J., and Lupski, J.R. 2011. Inverted genomic segments and complex triplication rearrangements are mediated by inverted repeats in the human genome. Nat. Genet. 43:1074‐1081.
Cer, R.Z., Bruce, K.H., Mudunuri, U.S., Yi, M., Volfovsky, N., Luke, B.T., Bacolla, A., Collins, J.R., and Stephens, R.M. 2011. Non‐B DB: A database of predicted non‐B DNA‐forming motifs in mammalian genomes. Nucleic Acids Res. 39:D383‐D391.
Cooper, D.N., Bacolla, A., Ferec, C., Vasquez, K.M., Kehrer‐Sawatzki, H., and Chen, J.M. 2011. On the sequence‐directed nature of human gene mutation: The role of genomic architecture and the local DNA sequence environment in mediating gene mutations underlying human inherited disease. Hum. Mutat. 32:1075‐1099.
Dragileva, E., Hendricks, A., Teed, A., Gillis, T., Lopez, E.T., Friedberg, E.C., Kucherlapati, R., Edelmann, W., Lunetta, K.L., MacDonald, M.E., and Wheeler, V.C. 2009. Intergenerational and striatal CAG repeat instability in Huntington's disease knock‐in mice involve different DNA repair genes. Neurobiol. Dis. 33:37‐47.
Emanuel, B.S. 2008. Molecular mechanisms and diagnosis of chromosome 22q11.2 rearrangements. Dev. Disabil. Res. Rev. 14:11‐18.
Entezam, A. and Usdin, K. 2008. ATR protects the genome against CGG.CCG‐repeat expansion in Fragile X premutation mice. Nucleic Acids Res. 36:1050‐1056.
Fleming, K., Riser, D.K., Kumari, D., and Usdin, K. 2003. Instability of the fragile X syndrome repeat in mice: The effect of age, diet and mutations in genes that affect DNA replication, recombination and repair proficiency. Cytogenet. Genome Res. 100:140‐146.
Foiry, L., Dong, L., Savouret, C., Hubert, L., te Riele, H., Junien, C., and Gourdon, G. 2006. Msh3 is a limiting factor in the formation of intergenerational CTG expansions in DM1 transgenic mice. Hum. Genet. 119:520‐526.
Gaddis, S.S., Wu, Q., Thames, H.D., DiGiovanni, J., Walborg, E.F., MacLeod, M.C., and Vasquez, K.M. 2006. A web‐based search engine for triplex‐forming oligonucleotide target sequences. Oligonucleotides 16:196‐201.
Gal, M., Katz, T., Ovadia, A., and Yagil, G. 2003. TRACTS: A program to map oligopurine.oligopyrimidine and other binary DNA tracts. Nucleic Acids Res. 31:3682‐3685.
Gomes‐Pereira, M., Fortune, M.T., Ingram, L., McAbney, J.P., and Monckton, D.G. 2004. Pms2 is a genetic enhancer of trinucleotide CAG.CTG repeat somatic mosaicism: Implications for the mechanism of triplet repeat expansion. Hum. Mol. Genet. 13:1815‐1825.
Inagaki, H., Ohye, T., Kogo, H., Kato, T., Bolor, H., Taniguchi, M., Shaikh, T.H., Emanuel, B.S., and Kurahashi, H. 2009. Chromosomal instability mediated by non‐B DNA: Cruciform conformation and not DNA sequence is responsible for recurrent translocation in humans. Genome Res. 19:191‐198.
Jenjaroenpun, P. and Kuznetsov, V.A. 2009. TTS mapping: Integrative WEB tool for analysis of triplex formation target DNA sequences, G‐quadruplets and non‐protein coding regulatory DNA elements in the human genome. BMC Genomics 10 Suppl 3:S9.
Kehrer‐Sawatzki, H., Haussler, J., Krone, W., Bode, H., Jenne, D.E., Mehnert, K.U., Tummers, U., and Assum, G. 1997. The second case of a t(17;22) in a family with neurofibromatosis type 1: Sequence analysis of the breakpoint regions. Hum. Genet. 99:237‐247.
Kikin, O., D'Antonio, L., and Bagga, P.S. 2006. QGRS Mapper: A web‐based server for predicting G‐quadruplexes in nucleotide sequences. Nucleic Acids Res. 34:W676‐W682.
Kostadinov, R., Malhotra, N., Viotti, M., Shine, R., D'Antonio, L., and Bagga, P. 2006. GRSDB: A database of quadruplex forming G‐rich sequences in alternatively processed mammalian pre‐mRNA sequences. Nucleic Acids Res. 34:D119‐D124.
Kurahashi, H., Inagaki, H., Ohye, T., Kogo, H., Kato, T., and Emanuel, B.S. 2006. Chromosomal translocations mediated by palindromic DNA. Cell Cycle 5:1297‐1303.
Kurahashi, H., Inagaki, H., Hosoba, E., Kato, T., Ohye, T., Kogo, H., and Emanuel, B.S. 2007. Molecular cloning of a translocation breakpoint hotspot in 22q11. Genome Res. 17:461‐469.
Kurahashi, H., Inagaki, H., Ohye, T., Kogo, H., Tsutsumi, M., Kato, T., Tong, M., and Emanuel, B.S. 2010. The constitutional t(11;22): Implications for a novel mechanism responsible for gross chromosomal rearrangements. Clin. Genet. 78:299‐309.
Lawson, A.R., Hindley, G.F., Forshew, T., Tatevossian, R.G., Jamie, G.A., Kelly, G.P., Neale, G.A., Ma, J., Jones, T.A., Ellison, D.W., and Sheer, D. 2011. RAF gene fusion breakpoints in pediatric brain tumors are characterized by significant enrichment of sequence microhomology. Genome Res. 21:505‐514.
Li, H., Xiao, J., Li, J., Lu, L., Feng, S., and Droge, P. 2009. Human genomic Z‐DNA segments probed by the Z alpha domain of ADAR1. Nucleic Acids Res. 37:2737‐2746.
Lin, Y. and Wilson, J.H. 2011. Transcription‐induced DNA toxicity at trinucleotide repeats: Double bubble is trouble. Cell Cycle 10:611‐618.
Lopez Castel, A., Cleary, J.D., and Pearson, C.E. 2010. Repeat instability as the basis for human diseases and as a potential target for therapy. Nat. Rev. Mol. Cell. Biol. 11:165‐170.
Manley, K., Shirley, T.L., Flaherty, L., and Messer, A. 1999. Msh2 deficiency prevents in vivo somatic instability of the CAG repeat in Huntington disease transgenic mice. Nat. Genet. 23:471‐473.
McMurray, C.T. 2010. Mechanisms of trinucleotide repeat instability during human development. Nat. Rev. Genet. 11:786‐799.
Messaed, C. and Rouleau, G.A. 2009. Molecular mechanisms underlying polyalanine diseases. Neurobiol. Dis. 34:397‐405.
Mirkin, S.M. 2007. Expandable DNA repeats and human disease. Nature 447:932‐940.
Orr, H.T. and Zoghbi, H.Y. 2007. Trinucleotide repeat disorders. Annu. Rev. Neurosci. 30:575‐621.
Pearson, C.E., Nichol Edamura, K., and Cleary, J.D. 2005. Repeat instability: Mechanisms of dynamic mutations. Nat. Rev. Genet. 6:729‐742.
Phan, A.T., Kuryavyi, V., and Patel, D.J. 2006. DNA architecture: From G to Z. Curr. Opin. Struct. Biol. 16:288‐298.
Punga, T. and Buhler, M. 2010. Long intronic GAA repeats causing Friedreich ataxia impede transcription elongation. EMBO Mol. Med. 2:120‐129.
Rimokh, R., Rouault, J.P., Wahbi, K., Gadoux, M., Lafage, M., Archimbaud, E., Charrin, C., Gentilhomme, O., Germain, D., Samarut, J., et al. 1991. A chromosome 12 coding region is juxtaposed to the MYC protooncogene locus in a t(8;12)(q24;q22) translocation in a case of B‐cell chronic lymphocytic leukemia. Genes Chromosomes Cancer 3:24‐36.
Scaria, V., Hariharan, M., Arora, A., and Maiti, S. 2006. Quadfinder: Server for identification and analysis of quadruplex‐forming motifs in nucleotide sequences. Nucleic Acids Res. 34:W683‐W685.
Schroth, G.P., Chou, P.J., and Ho, P.S. 1992. Mapping Z‐DNA in the human genome. Computer‐aided mapping reveals a nonrandom distribution of potential Z‐DNA‐forming sequences in human genes. J. Biol. Chem. 267:11846‐11855.
Seite, P., Leroux, D., Hillion, J., Monteil, M., Berger, R., Mathieu‐Mahul, D., and Larsen, C.J. 1993. Molecular analysis of a variant 18;22 translocation in a case of lymphocytic lymphoma. Genes Chromosomes Cancer 6:39‐44.
Shelbourne, P.F., Keller‐McGandy, C., Bi, W.L., Yoon, S.R., Dubeau, L., Veitch, N.J., Vonsattel, J.P., Wexler, N.S., Arnheim, N., and Augood, S.J. 2007. Triplet repeat mutation length gains correlate with cell‐type specific vulnerability in Huntington disease brain. Hum. Mol. Genet. 16:1133‐1142.
Sheridan, M.B., Kato, T., Haldeman‐Englert, C., Jalali, G.R., Milunsky, J.M., Zou, Y., Klaes, R., Gimelli, G., Gimelli, S., Gemmill, R.M., Drabkin, H.A., Hacker, A.M., Brown, J., Tomkins, D., Shaikh, T.H., Kurahashi, H., Zackai, E.H., and Emanuel, B.S. 2010. A palindrome‐mediated recurrent translocation with 3:1 meiotic nondisjunction: The t(8;22)(q24.13;q11.21). Am. J. Hum. Genet. 87:209‐218.
Simsek, D., Brunet, E., Wong, S.Y., Katyal, S., Gao, Y., McKinnon, P.J., Lou, J., Zhang, L., Li, J., Rebar, E.J., Gregory, P.D., Holmes, M.C., and Jasin, M. 2011. DNA ligase III promotes alternative nonhomologous end‐joining during chromosomal translocation formation. PLoS Genet. 7:e1002080.
Sinclair, P.B., Parker, H., An, Q., Rand, V., Ensor, H., Harrison, C.J., and Strefford, J.C. 2011. Analysis of a breakpoint cluster reveals insight into the mechanism of intrachromosomal amplification in a lymphoid malignancy. Hum. Mol. Genet. 20:2591‐2602.
Stankiewicz, P. and Lupski, J.R. 2010. Structural variation in the human genome and its role in disease. Annu. Rev. Med. 61:437‐455.
Stein, L.D., Mungall, C., Shu, S., Caudy, M., Mangone, M., Day, A., Nickerson, E., Stajich, J.E., Harris, T.W., Arva, A., and Lewis, S. 2002. The generic genome browser: A building block for a model organism system database. Genome Res. 12:1599‐1610.
Thandla, S.P., Ploski, J.E., Raza‐Egilmez, S.Z., Chhalliyil, P.P., Block, A.W., de Jong, P.J., and Aplan, P.D. 1999. ETV6‐AML1 translocation breakpoints cluster near a purine/pyrimidine repeat region in the ETV6 gene. Blood 93:293‐299.
Tong, M., Kato, T., Yamada, K., Inagaki, H., Kogo, H., Ohye, T., Tsutsumi, M., Wang, J., Emanuel, B.S., and Kurahashi, H. 2010. Polymorphisms of the 22q11.2 breakpoint region influence the frequency of de novo constitutional t(11;22)s in sperm. Hum. Mol. Genet. 19:2630‐2637.
van den Broek, W.J., Nelen, M.R., Wansink, D.G., Coerwinkel, M.M., te Riele, H., Groenen, P.J., and Wieringa, B. 2002. Somatic expansion behaviour of the (CTG)n repeat in myotonic dystrophy knock‐in mice is differentially affected by Msh3 and Msh6 mismatch‐repair proteins. Hum. Mol. Genet. 11:191‐198.
Vandyke, D.L., Weiss, L., Roberson, J.R., and Babu, V.R. 1983. The frequency and mutation‐rate of balanced autosomal rearrangements in man estimated from prenatal genetic‐studies for advanced maternal age. Am. J. Hum. Genet. 35:301‐308.
Wells, R.D. 2007. Non‐B DNA conformations, mutagenesis and disease. Trends Biochem. Sci. 32:271‐278.
Wells, R.D. 2008. DNA triplexes and Friedreich ataxia. FASEB J. 22:1625‐1634.
Wells, R.D., Dere, R., Hebert, M.L., Napierala, M., and Son, L.S. 2005. Advances in mechanisms of genetic instability related to hereditary neurological diseases. Nucleic Acids Res. 33:3785‐3798.
Wheeler, V.C., Lebel, L.A., Vrbanac, V., Teed, A., te Riele, H., and MacDonald, M.E. 2003. Mismatch repair gene Msh2 modifies the timing of early disease in Hdh(Q111) striatum. Hum. Mol. Genet. 12:273‐281.
Wiemels, J.L. and Greaves, M. 1999. Structure and possible mechanisms of TEL‐AML1 gene fusions in childhood acute lymphoblastic leukemia. Cancer Res. 59:4075‐4082.
Yadav, V.K., Abraham, J.K., Mani, P., Kulshrestha, R., and Chowdhury, S. 2008. QuadBase: Genome‐wide database of G4 DNA: Occurrence and conservation in human, chimpanzee, mouse and rat promoters and 146 microbes. Nucleic Acids Res. 36:D381‐D385.
Zhang, R., Lin, Y., and Zhang, C.T. 2008. Greglist: A database listing potential G‐quadruplex regulated genes. Nucleic Acids Res. 36:D372‐D376.
Zhao, J., Bacolla, A., Wang, G., and Vasquez, K.M. 2010. Non‐B DNA structure‐induced genetic instability and evolution. Cell Mol. Life Sci. 67:43‐62.
Zu, T., Gibbens, B., Doty, N.S., Gomes‐Pereira, M., Huguet, A., Stone, M.D., Margolis, J., Peterson, M., Markowski, T.W., Ingram, M.A., Nan, Z., Forster, C., Low, W.C., Schoser, B., Somia, N.V., Clark, H.B., Schmechel, S., Bitterman, P.B., Gourdon, G., Swanson, M.S., Moseley, M., and Ranum, L.P. 2011. Non‐ATG‐initiated translation directed by microsatellite expansions. Proc. Natl. Acad. Sci. U.S.A. 108:260‐265.
Internet Resources
http://nonb.abcc.ncifcrf.gov
Non‐B DB, a database resource for integrated annotations and analysis of non‐B DNA‐forming motifs.
http://pbrowse3.abcc.ncifcrf.gov/cgi-bin/gb2/gbrowse/Human_37/
PolyBrowse, ABCC genome browser for variations and annotations.
http://tandem.bu.edu/trf/trf.submit.options.html
Tandem Repeats Finder.
http://miracle.igib.res.in/quadfinder/crux.html
QuadFinder to find cruciform DNA.
http://quadbase.igib.res.in/
QuadBase, a database of quadruplex motifs.
http://tubic.tju.edu.cn/greglist/
Greglist, a database of G‐quadruplex regulated genes.
http://bioinformatics.ramapo.edu/GRSDB2/
GRSDB, a database of G‐Rich sequences.
http://bioinformatics.ramapo.edu/QGRS/index.php
Quadruplex forming G‐Rich Sequences (QGRS) Mapper.
http://tandem.bu.edu/irf/irf.download.html
Inverted Repeat Finder, a command line version of the IRF algorithm used to investigate inverted repeat structure of the human genome.
http://ggeda.bii.a-star.edu.sg/~piroonj/TTS_mapping/TTS_mapping.php
Triplex Target DNA Site (TTS) Mapping.
http://spi.mdanderson.org/tfo/
Triplex‐Forming Oligonucleotide Target Sequence Search program.
http://bioportal.weizmann.ac.il/tracts/tracts.html
The Tracts program to detect and analyze binary tracts in a DNA sequence.
http://gac-web.cgrb.oregonstate.edu/zDNA/
Z‐Hunt tool to find Z‐DNA.