Targeted Exon Sequencing by In‐Solution Hybrid Selection

Source: Internet, 2013-12-31

Abstract

This unit describes a protocol for the targeted enrichment of exons from randomly sheared genomic DNA libraries using an in‐solution hybrid selection approach for sequencing on an Illumina Genome Analyzer II. The steps for designing and ordering a hybrid selection oligo pool are reviewed, as are critical steps for performing the preparation and hybrid selection of an Illumina paired‐end library. Critical parameters, performance metrics, and analysis workflow are discussed. Curr. Protoc. Hum. Genet. 66:18.4.1‐18.4.24 © 2010 by John Wiley & Sons, Inc.

Keywords: exon sequencing; hybrid selection; mutation discovery; DNA sequencing; targeting

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Introduction
Strategic Planning
Basic Protocol 1: DNA Fragmentation
Basic Protocol 2: Paired‐End Library Preparation
Basic Protocol 3: Hybrid Selection
Basic Protocol 4: Library Quantification by qPCR
Support Protocol 1: Read Alignment and Evaluation of Sequence Data
Commentary
Literature Cited
Figures
Tables

Materials

Basic Protocol 1: DNA Fragmentation

Materials

DNA sample (e.g., see appendix 3B)
Nuclease‐free water
70% (v/v) ethanol

NanoDrop ND‐1000 spectrophotometer
Covaris S‐2 Sample Preparation System
VWR circulating chiller
Covaris shearing vial (6 × 16‐mm AFA fiber vial; cat. no. 520045)
1.5‐ml microcentrifuge tube
Agencourt AMPure XP kit (Beckman Coulter, cat. no. A63881)
Magnetic separator (DynaMag Spin Magnet, Invitrogen, cat. no. 123‐20D)

Additional reagents and equipment for DNA quantitation (appendix 3D) and agarose gel electrophoresis (unit 2.7)

Basic Protocol 2: Paired‐End Library Preparation

Materials

Illumina Paired End Sample Prep Kit (cat. no. PE‐102‐1001), containing:
- 10× T4 DNA ligase buffer w/10 mM ATP
- T4 polynucleotide kinase
- T4 DNA polymerase
- Klenow fragment (3′→5′ exo−) and Klenow buffer
- 10 mM dNTP mix
- 1 mM dATP
- DNA ligase and 2× buffer
Nuclease‐free water
Sheared, cleaned DNA sample (see protocol 1)
Paired‐end oligo mix (Illumina)
2× Phusion high‐fidelity PCR master mix (Finnzymes, cat. no. F‐531S)
PCR primers, 100 µM each:
- PE1.0: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
- PE2.0: CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT

96‐well PCR plate
Thermocycler

Additional reagents and equipment for cleaning DNA with AMPure beads (see protocol 1), agarose gel electrophoresis (unit 2.7), and DNA quantitation (appendix 3D)

Basic Protocol 3: Hybrid Selection

Materials

Adapter‐ligated DNA (see protocol 2)
50× Denhardt's solution
20× SSPE
Nuclease‐free water
10% SDS
0.5 M EDTA
1.0 mg/ml human Cot‐1 DNA (Invitrogen, cat. no. 15279‐101)
10.0 mg/ml salmon sperm DNA (Invitrogen, cat. no. 15632‐011)
Blocking oligos (200 µM each, custom oligos from IDT)
- Oligo 1.0: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
- Oligo 2.0: CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT
100 ng/µl Biotinylated RNA Oligo Library (Agilent Technologies SureSelect)
20 U/µl Superase‐In RNase Inhibitor (Applied Biosystems, cat. no. AM2694)
Dynabeads M‐280 Streptavidin Beads (Invitrogen, cat. no. 112‐05D)
5 M NaCl
1 M Tris‐Cl
20× SSC
0.1 N NaOH
2× Phusion high‐fidelity PCR master mix (Finnzymes, cat. no. F‐531S)
PCR primers, 100 µM each:
- PE1.0: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
- PE2.0: CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT

NanoDrop ND‐1000 spectrophotometer
Speedvac evaporator
65°C heating block with 1.5‐ml tube holder
96‐well PCR plates
1.5‐ml microcentrifuge tubes
Adhesive plate seal
96‐well thermocycler with heated lid
50‐ml conical tube
Magnetic separator (DynaMag Spin Magnet, Invitrogen, cat. no. 123‐20D)

Additional reagents and equipment for DNA quantitation (appendix 3D) and cleaning DNA with AMPure beads (see protocol 1)

Basic Protocol 4: Library Quantification by qPCR

Materials

10 nM PhiX Control Library (Illumina, cat. no. 1006471)
Nuclease‐free water
Target‐selected DNA library (see protocol 3)
2× Brilliant SYBR Green QPCR Master Mix (Stratagene, cat. no. 600548)
1 mM ROX Reference Dye
1.25 µM P5 PCR primer (AATGATACGGCGACCACCGA)
1.25 µM P7 PCR primer (CAAGCAGAAGACGGCATACGA)

384‐well MicroAmp Optical Reaction Plate (Applied Biosystems, cat. no. 4326270)
MicroAmp Optical Adhesive Film (Applied Biosystems, cat. no. 4311976)
ABI 7900HT Real‐Time PCR System with SDS V2.3 software (Applied Biosystems)
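
As a quick consistency check on the sequences listed in these Materials, the sketch below compares the P5 and P7 qPCR primers with the PE1.0 and PE2.0 PCR primers of Basic Protocols 2 and 3. As transcribed here, each qPCR primer matches the 5′ end of the corresponding PE primer, which suggests that the qPCR assay should amplify only molecules carrying both full‐length adapter ends. The function name `is_five_prime_prefix` is a hypothetical helper introduced for illustration; the sequences are copied from the lists above.

```python
# Primer sequences as listed in the Materials sections (5' to 3').
PE1_0 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
PE2_0 = "CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT"
P5 = "AATGATACGGCGACCACCGA"
P7 = "CAAGCAGAAGACGGCATACGA"


def is_five_prime_prefix(short_primer: str, long_primer: str) -> bool:
    """Return True if short_primer equals the 5' end of long_primer.

    Whitespace is removed and case is ignored so that copy-paste
    artifacts do not cause false mismatches.
    """
    short_clean = "".join(short_primer.split()).upper()
    long_clean = "".join(long_primer.split()).upper()
    return long_clean.startswith(short_clean)


if __name__ == "__main__":
    for name, short, long_ in (("P5 vs PE1.0", P5, PE1_0),
                               ("P7 vs PE2.0", P7, PE2_0)):
        print(name, is_five_prime_prefix(short, long_))
```

The same helper can be used to confirm that a newly ordered oligo matches the intended sequence before it is used at the bench.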

Figures

Figure 18.4.1 Targets, baits, and nomenclature. Sequencing reads can fall into several categories depending on where they align along a targeted region of the genome. Bases aligning to the exact targeted sequence are considered “on target.” Because RNA bait sequences can hang off the ends of the actual target, aligned bases can be “off target” but “on bait.” Additionally, because randomly sheared fragments vary in size, it is realistic to expect a proportion of aligned bases to be “near bait,” which is considered ±250 bp of the bait sequence. Metrics calculating the percentage of bases falling into these categories are helpful in understanding the performance of a hybrid selection experiment.
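
The categories in this caption can be sketched in code. The example below is a minimal illustration, not the analysis pipeline of the unit: it assumes targets and baits are given as half‐open intervals on a single chromosome, treats the categories as mutually exclusive (checked in the order on target, on bait, near bait), and uses the label "off bait" for everything else. The function names, the "off bait" label, and the example coordinates are all hypothetical; only the ±250 bp window comes from the caption.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome

NEAR_BAIT_FLANK_BP = 250  # "near bait" window taken from the figure caption

CATEGORIES = ("on target", "on bait", "near bait", "off bait")


def in_any(pos: int, intervals: List[Interval]) -> bool:
    """Return True if pos lies inside at least one interval."""
    return any(start <= pos < end for start, end in intervals)


def classify_base(pos: int, targets: List[Interval], baits: List[Interval],
                  flank: int = NEAR_BAIT_FLANK_BP) -> str:
    """Assign one aligned base position to exactly one category."""
    if in_any(pos, targets):
        return "on target"
    if in_any(pos, baits):
        return "on bait"  # off target, but covered by an overhanging bait
    padded = [(max(0, start - flank), end + flank) for start, end in baits]
    if in_any(pos, padded):
        return "near bait"
    return "off bait"  # assumed label for all remaining bases


def category_percentages(positions: List[int], targets: List[Interval],
                         baits: List[Interval]) -> Dict[str, float]:
    """Percentage of aligned bases falling into each category."""
    counts = {name: 0 for name in CATEGORIES}
    for pos in positions:
        counts[classify_base(pos, targets, baits)] += 1
    total = len(positions)
    return {name: (100.0 * n / total if total else 0.0)
            for name, n in counts.items()}


if __name__ == "__main__":
    targets = [(1000, 1100)]  # hypothetical exon
    baits = [(980, 1120)]     # hypothetical bait overhanging the exon
    aligned = [1050, 990, 800, 100]
    print(category_percentages(aligned, targets, baits))
```

A linear scan over intervals is adequate for a sketch; for a real bait set with many thousands of intervals, an interval tree or a sorted sweep would be the more practical choice.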

Figure 18.4.2 Sheared genomic DNA size distribution. High‐quality genomic DNA was sheared using the Covaris instrument. Unsheared gDNA (100 ng) and sheared DNA (200 ng) were run in parallel on a 2% agarose gel. After shearing, the bulk of the fragments should run between ∼100 and 400 bp.

Figure 18.4.3 qPCR library quantification. Real‐time SYBR Green qPCR is used for accurate quantification of libraries prior to sequencing. An accurate quantitation is essential for calculating the amount of library to be loaded onto a flow cell for optimal cluster density and high sequence yields. Shown in this figure are the amplification plots for a two‐fold serial dilution standard curve as well as four libraries, all run in triplicate. The standard curve is plotted and used to calculate the concentration of each library.
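
The standard‐curve calculation described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions: the threshold cycle (Ct) is modeled as a linear function of log10 concentration, the fit is an ordinary least‐squares line, and the library concentration is read off the line from the mean Ct of its triplicate wells. The function names, the concentration units (pM), and all numeric values are hypothetical; they are idealized numbers in which each two‐fold dilution adds exactly one cycle.

```python
import math
from statistics import mean
from typing import List, Tuple


def fit_standard_curve(concs: List[float], cts: List[float]) -> Tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(conc) + intercept."""
    xs = [math.log10(c) for c in concs]
    x_bar, y_bar = mean(xs), mean(cts)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def concentration_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to get a concentration from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope: float) -> float:
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical two-fold serial dilution of the standard, in pM.
    standard_concs = [20.0, 10.0, 5.0, 2.5, 1.25]
    standard_cts = [12.0, 13.0, 14.0, 15.0, 16.0]  # idealized mean Ct values
    slope, intercept = fit_standard_curve(standard_concs, standard_cts)

    library_cts = [13.9, 14.0, 14.1]  # hypothetical triplicate wells
    library_conc = concentration_from_ct(mean(library_cts), slope, intercept)

    print(f"slope={slope:.3f}, efficiency={amplification_efficiency(slope):.2f}")
    print(f"diluted library concentration ~ {library_conc:.2f} pM")
```

If the library was diluted before qPCR, the value read from the curve must be multiplied by that dilution factor to obtain the stock concentration.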

Figure 18.4.4 Hybrid selection visualization using the Integrative Genomics Viewer (IGV). After analysis, sequence BAM files are loaded into the IGV. (A) Exons of varying lengths on the BRCA1 gene can be seen in the lower RefSeq Gene track represented by thick blue bars. In the upper sequencing read track, aligned reads can be seen piling up over the targeted regions showing deep coverage of the exonic regions and minimal off‐target sequencing. (B) Zooming in to a higher base‐pair resolution allows actual mutations to be observed in comparison to a reference sequence.

Literature Cited

	Bashiardes, S., Veile, R., Helms, C., Mardis, E.R., Bowcock, A.M., and Lovett, M. 2005. Direct genomic selection. Nat. Methods 2:63‐69.
	Bentley, D.R., Balasubramanian, S., Swerdlow, H.P., Smith, G.P., Milton, J., Brown, C.G., Hall, K.P., Evers, D.J., Barnes, C.L., Bignell, H.R., et al. 2008. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456:53‐59.
	Gnirke, A., Melnikov, A., Maguire, J., Rogov, P., LeProust, E.M., Brockman, W., Fennell, T., Giannoukos, G., Fisher, S., Russ, C., et al. 2009. Solution hybrid selection with ultra‐long oligonucleotides for massively parallel targeted sequencing. Nat. Biotechnol. 27:182‐189.
	Harismendy, O., Ng, P.C., Strausberg, R.L., Wang, X., Stockwell, T.B., Beeson, K.Y., Schork, N.J., Murray, S.S., Topol, E.J., Levy, S., and Frazer, K.A. 2009. Evaluation of next generation sequencing platforms for population targeted sequencing studies. Genome Biol. 10:R32.
	Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al.; International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature 409:860‐921.
	Li, H. and Durbin, R. 2009. Fast and accurate short read alignment with Burrows‐Wheeler transform. Bioinformatics 25:1754‐1760.
	Li, H., Ruan, J., and Durbin, R. 2008. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 18:1851‐1858.
	Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R.; 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078‐2079.
	Li, J.B., Gao, Y., Aach, J., Zhang, K., Kryukov, G.V., Xie, B., Ahlford, A., Yoon, J.‐K., Rosenbaum, A.M., Zaranek, A.W., LeProust, E., Sunyaev, S.R., and Church, G.M. 2009. Multiplex padlock targeted sequencing reveals human hypermutable CpG variations. Genome Res. 19:1606‐1615.
	Mardis, E.R. 2008. The impact of next‐generation sequencing technology on genetics. Trends Genet. 24:133‐141.
	Ng, S.B., Turner, E.H., Robertson, P.D., Flygare, S.D., Bigham, A.W., Lee, C., Shaffer, T., Wong, M., Bhattacharjee, A., Eichler, E.E., Bamshad, M., Nickerson, D.A., and Shendure, J. 2009. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461:272‐276.
	Quail, M.A., Kozarewa, I., Smith, F., Scally, A., Stephens, P.J., Durbin, R., Swerdlow, H., and Turner, D.J. 2008. A large genome center's improvements to the Illumina sequencing system. Nat. Methods 5:1005‐1010.
	Raymond, F.L., Whibley, A., Stratton, M.R., and Gecz, J. 2009. Lessons learnt from large‐scale exon re‐sequencing of the X chromosome. Hum. Mol. Genet. 18:R60‐R64.
	Sanger, F., Nicklen, S., and Coulson, A.R. 1977. DNA sequencing with chain‐terminating inhibitors. Proc. Natl. Acad. Sci. U.S.A. 74:5463‐5467.
	Shendure, J. and Ji, H. 2008. Next‐generation DNA sequencing. Nat. Biotechnol. 26:1135‐1145.
	Sjöblom, T., Jones, S., Wood, L.D., Parsons, D.W., Lin, J., Barber, T.D., Mandelker, D., Leary, R.J., Ptak, J., Silliman, N., et al. 2006. The consensus coding sequences of human breast and colorectal cancers. Science 314:268‐274.
	Thomas, R.K., Nickerson, E., Simons, J.F., Jänne, P.A., Tengs, T., Yuza, Y., Garraway, L.A., LaFramboise, T., Lee, J.C., Shah, K., et al. 2006. Sensitive mutation detection in heterogeneous cancer specimens by massively parallel picoliter reactor sequencing. Nat. Med. 12:852‐855.