A Survey of Copy‐Number Variation Detection Tools Based on High‐Throughput Sequencing Data

Internet, 2013-12-31

Abstract

Copy‐number variation (CNV) is a major class of genomic variation with potentially important functional consequences in both normal and diseased populations. Remarkable advances in development of next‐generation sequencing (NGS) platforms provide an unprecedented opportunity for accurate, high‐resolution characterization of CNVs. In this unit, we give an overview of available computational tools for detection of CNVs and discuss comparative advantages and disadvantages of different approaches. Curr. Protoc. Hum. Genet. 75:7.19.1‐7.19.15. © 2012 by John Wiley & Sons, Inc.

Keywords: structural variation; insertion; deletion; indel; inversion; translocation

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Introduction
Overview of CNV Detection Approaches Based on NGS Data
Discussion
Literature Cited
Figures
Tables

Figures

Figure 7.19.1 Four basic CNV/SV detection strategies: (1) Read‐depth methods use read density to detect CNVs. (2) Paired‐end mapping (PEM) methods detect CNVs/SVs by analyzing the configurations of PEMs. (3) Split‐read methods first map the two ends of a read separately to identify small insertions and deletions. Split reads can also be used to pinpoint the exact location of breakpoints. (4) Assembly methods identify CNVs/SVs by assembling the short reads into contigs.

Figure 7.19.2 Read‐depth methods. This approach detects CNVs by investigating the read‐densities in genomic regions. Read‐depth methods can be used to detect both germline and somatic CNVs.
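
The read‐depth idea in this caption can be illustrated with a minimal sketch. It bins read start positions into fixed windows and labels each window by the log2 ratio of sample to control counts. The function names, window size, thresholds, and pseudocount are illustrative assumptions and are not taken from any tool surveyed in this unit. Practical callers typically also correct for biases such as GC content, which this sketch ignores.

```python
import math
from collections import Counter

def window_counts(read_starts, window_size):
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(pos // window_size for pos in read_starts)

def call_windows(sample_starts, control_starts, window_size=1000,
                 gain=0.4, loss=-0.4, pseudocount=1.0):
    """Label each window 'gain', 'loss', or 'neutral' from log2(sample/control).

    The thresholds and pseudocount are arbitrary illustrative choices.
    """
    sample = window_counts(sample_starts, window_size)
    control = window_counts(control_starts, window_size)
    # Rescale the sample so that differences in total read count cancel out.
    scale = max(len(control_starts), 1) / max(len(sample_starts), 1)
    calls = {}
    for w in sorted(set(sample) | set(control)):
        ratio = (sample[w] * scale + pseudocount) / (control[w] + pseudocount)
        log2_ratio = math.log2(ratio)
        if log2_ratio >= gain:
            calls[w] = "gain"
        elif log2_ratio <= loss:
            calls[w] = "loss"
        else:
            calls[w] = "neutral"
    return calls
```

Using a matched control, as here, corresponds to the somatic setting; for germline calls the control counts could be replaced by an expected depth per window, which is a different assumption not shown in the sketch.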

Figure 7.19.3 Configurations of discordant PEMs. (A) A fragment covering the breakpoint of a deletion is sequenced. The distance between the two ends of the read pair is significantly larger than expected, and hence indicates a deletion. (B) A small insertion is indicated by a read pair whose mapping distance is significantly less than expected. (C) A large fragment (larger than the insert size) in chromosome A is inserted into another position on chromosome A. This event will generate two classes of discordant PEMs. One class, similar to PEMs from deletions, will have mapped distances between the two ends significantly larger than expected (the left pair on the plot). The other class will have the order of the two ends switched (the right pair on the plot). (D) A read pair mapped to the same strand signals an inversion. (E) After alignment of the read pair shown, the mapped positions of the two ends are switched. This can be due to a tandem duplication or a large insertion (see C in this figure). (F) An inter‐chromosomal translocation can be indicated by a read pair whose two ends are mapped to different chromosomes.
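
The configurations in this caption can be summarized as a small decision procedure. The sketch below is a simplified illustration, not the algorithm of any PEM tool cited in this unit. It assumes a library in which a concordant pair has its left end on the forward strand and its right end on the reverse strand, approximates the fragment span by the distance between mapped start positions, and uses a hypothetical cutoff of three standard deviations around the expected insert size. The `ReadPair` record and `classify_pair` function are names invented for this example.

```python
from collections import namedtuple

# Hypothetical minimal record for one mapped read pair.
ReadPair = namedtuple("ReadPair", "chrom1 pos1 strand1 chrom2 pos2 strand2")

def classify_pair(pair, mean_insert, sd_insert, n_sd=3.0):
    """Assign a read pair to one of the configurations (A) to (F)."""
    if pair.chrom1 != pair.chrom2:
        return "inter-chromosomal"                       # panel F
    if pair.strand1 == pair.strand2:
        return "inversion"                               # panel D
    # Order the two ends by mapped coordinate.
    left, right = sorted([(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)])
    if left[1] == "-":
        # The reverse-strand end maps first, so the ends are switched.
        return "tandem duplication or large insertion"   # panels C and E
    span = right[0] - left[0]
    if span > mean_insert + n_sd * sd_insert:
        return "deletion"                                # panel A
    if span < mean_insert - n_sd * sd_insert:
        return "insertion"                               # panel B
    return "concordant"
```

A single discordant pair is weak evidence on its own; a caller would normally require several pairs supporting the same event before reporting it.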

Figure 7.19.4 Split read mapping. Split reads can be used to identify small insertions (A) or deletions (B).
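
A toy version of split read mapping is sketched below. It maps a prefix and a suffix of a read separately by exact string search, then compares the gap between them on the reference with the gap between them in the read. The function name, the `min_anchor` parameter, and the use of exact matching at the first occurrence are assumptions made for illustration; real split‐read tools rely on an aligner and tolerate mismatches.

```python
def split_read_call(read, reference, min_anchor=5):
    """Return ('deletion', pos, length), ('insertion', pos, bases), or None.

    Positions are 0-based reference coordinates of the breakpoint.
    Only exact matches are considered (a deliberate simplification).
    """
    if read in reference:
        return None  # the read maps end to end, so no split is needed
    n = len(read)
    for i in range(min_anchor, n - min_anchor + 1):   # prefix is read[:i]
        p = reference.find(read[:i])
        if p < 0:
            break  # a longer prefix cannot match if this one does not
        prefix_end = p + i
        for j in range(i, n - min_anchor + 1):        # suffix is read[j:]
            s = reference.find(read[j:], prefix_end)
            if s < 0:
                continue
            ref_gap, read_gap = s - prefix_end, j - i
            if ref_gap > 0 and read_gap == 0:
                # Reference bases are skipped by the read.
                return ("deletion", prefix_end, ref_gap)
            if ref_gap == 0 and read_gap > 0:
                # The read carries bases absent from the reference.
                return ("insertion", prefix_end, read[i:j])
    return None
```

When the sequence around a breakpoint is repeated, several split points can be equally valid; this sketch simply reports the first one it finds.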

Figure 7.19.5 Assembly‐based methods. In principle, assembly methods can be used to detect any genomic variation. The general strategy is to find the overlaps between the short reads and assemble the short reads into contigs. In this simplified example, a copy number gain (S1 and S2 in the plot) can be identified by an assembly method.
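
The overlap‐and‐merge step described in this caption can be sketched with a greedy procedure that repeatedly joins the two sequences sharing the longest suffix‐prefix overlap. This is a minimal illustration under stated assumptions (error‐free reads, a single strand, a hypothetical `min_len` overlap threshold), not the method of any assembler cited in this unit, and it covers only contig construction, not the subsequent comparison of contigs against a reference to call CNVs.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b, or 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by longest overlap until no overlap of min_len remains."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            return contigs
        # Join the best pair, keeping the overlapping bases only once.
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (i, j)] + [merged]
```

Greedy merging can join reads incorrectly in repeated sequence, which is one reason duplicated regions are difficult for short‐read assembly.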
