Generating a Genome Assembly with PCAP

Abstract

 

This unit describes how to use the Parallel Contig Assembly Program (PCAP) to assemble the data produced by a whole-genome shotgun sequencing project. We present a basic protocol for using PCAP on a multiprocessor computer in a 300-Mb genome assembly project. A support protocol to prepare input files for PCAP is also described. Another basic protocol for using PCAP on a distributed cluster of computers in a 3-Gb genome assembly project is presented, in addition to suggestions for understanding results from PCAP.

Keywords: Whole-Genome Shotgun Sequencing; Genome Assembly

     
 
GO TO THE FULL PROTOCOL:
PDF or HTML at Wiley Online Library

Table of Contents

  • Basic Protocol 1: Producing an Assembly with PCAP Using an Example Data Set
  • Support Protocol 1: Downloading and Installing PCAP
  • Support Protocol 2: Preparation of Input Files
  • Support Protocol 3: Generating the fofn.con File
  • Basic Protocol 2: Generating a Large‐Scale Assembly with PCAP Using Distributed Computing
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
     
 


Figures

  • Figure 11.3.1 The top part of the contigs.bases file produced on the example data set.
  • Figure 11.3.2 The entire content of the supercontigs file produced on the example data set.
  • Figure 11.3.3 The top part of the reads.placed file produced on the example data set.
  • Figure 11.3.4 The entire content of the reads.unplaced file produced on the example data set.
  • Figure 11.3.5 The entire content of the readpairs.contigs file produced on the example data set.
  • Figure 11.3.6 The top part of the readpairs.reads file produced on the example data set.
  • Figure 11.3.7 The top part of the fofn.con.pcap.results file produced on the example data set.
  • Figure 11.3.8 The entire content of the fofn.con.pcap.sort.stat file produced on the example data set.
  • Figure 11.3.9 The middle part of the fofn.pcap.n50 file produced on the example data set.
  • Figure 11.3.10 The entire content of the fofn.pcap.contigs1.snp file produced on the example data set.
  • Figure 11.3.11 Specification of read pairs in the .con file when the same subclone is sequenced multiple times.
  • Figure 11.3.12 The top part of the fofn.con file for the example data set.

Literature Cited

   Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403‐410.
   Aparicio, S., Chapman, J., Stupka, E., Putnam, N., Chia, J.M., Dehal, P., Christoffels, A., Rash, S., Hoon, S., Smit, A.F., Gelpke, M.D., Roach, J., Oh, T., Ho, I.Y., Wong, M., Detter, C., Verhoef, F., Predki, P., Tay, A., Lucas, S., Richardson, P., Smith, S.F., Clark, M.S., Edwards, Y.J., Doggett, N., Zharkikh, A., Tavtigian, S.V., Pruss, D., Barnstead, M., Evans, C., Baden, H., Powell, J., Glusman, G., Rowen, L., Hood, L., Tan, Y.H., Elgar, G., Hawkins, T., Venkatesh, B., Rokhsar, D., and Brenner, S. 2002. Whole‐genome shotgun assembly and analysis of the genome of Fugu rubripes. Science 297:1301‐1310.
   Havlak, P., Chen, R., Durbin, K.J., Egan, A., Ren, Y., Song, X.‐Z., Weinstock, G.M., and Gibbs, R. 2004. The Atlas genome assembly system. Genome Res. 14:721‐732.
   Huang, X. and Madan, A. 1999. CAP3: A DNA sequence assembly program. Genome Res. 9:868‐877.
   Huang, X., Wang, J., Aluru, S., Yang, S.‐P., and Hillier, L. 2003. PCAP: A whole‐genome assembly program. Genome Res. 13:2164‐2170.
   Jaffe, D.B., Butler, J., Gnerre, S., Mauceli, E., Lindblad‐Toh, K., Mesirov, J.P., Zody, M.C. and Lander, E.S. 2003. Whole‐genome sequence assembly for mammalian genomes: ARACHNE 2. Genome Res. 13:91‐96.
   Kent, W.J. 2002. BLAT: The BLAST‐like alignment tool. Genome Res. 12:656‐664.
   Kruskal, J.B. 1956. On the shortest spanning subtree of a graph and the traveling salesman problem. Proc. Amer. Math. Soc. 7:48‐50.
   Mullikin, J.C. and Ning, Z. 2003. The Phusion assembler. Genome Res. 13:81‐90.
   Myers, E.W., Sutton, G.G., Delcher, A.L., Dew, I.M., Fasulo, D.P., Flanigan, M.J., Kravitz, S.A., Mobarry, C.M., Reinert, K.H., Remington, K.A., Anson, E.L., Bolanos, R.A., Chou, H.H., Jordan, C.M., Halpern, A.L., Lonardi, S., Beasley, E.M., Brandon, R.C., Chen, L., Dunn, P.J., Lai, Z., Liang, Y., Nusskern, D.R., Zhan, M., Zhang, Q., Zheng, X., Rubin, G.M., Adams, M.D., and Venter, J.C. 2000. A whole‐genome assembly of Drosophila. Science 287:2196‐2204.
   Needleman, S.B. and Wunsch, C.D. 1970. A general method applicable to the search for similarities in the amino acid sequences of two proteins. J. Mol. Biol. 48:443‐453.
   Pearson, W.R. and Lipman, D. 1988. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. U.S.A. 85:2444‐2448.
   Smith, T.F. and Waterman, M.S. 1981. Identification of common molecular subsequences. J. Mol. Biol. 147:195‐197.
Key References
   Huang et al., 2003. See above.
   This article describes the methods used in PCAP in detail.
Internet Resources
   http://seq.cs.iastate.edu
   This site contains documentation on PCAP and example test data sets.