        The TRC shRNA Design Process


        Overview

        • We design shRNA molecules with an algorithm that uses several criteria to rank potential 21mer targets within each human and mouse RefSeq transcript. The algorithm applies a set of rules, including rules derived from the siRNA literature, from our cloning scheme, from constraints on oligonucleotide synthesis, and others. In applying the algorithm, we aim to balance two competing goals: make hairpins that effectively knock down the target transcript and, as far as possible, design hairpins that knock down only one gene and not other so-called "off-target" genes. Each goal presents distinct challenges. The criteria for predicting effective knockdown with either siRNA or shRNA are not well understood. Our rules are derived primarily from the siRNA literature, and how well they apply to shRNA design is unclear. Genome evolution constrains target specificity: many genes belong to extensive gene families, which can make targeting any single gene difficult, and functionally distinct genes share many motifs. Our knowledge of transcript structure and variants is also still very incomplete. For all these reasons and more, we construct 5 shRNAs for each transcript, expecting a range of knockdown efficiencies across the set and at least one or two hairpins that knock down the target effectively.
        • Users of this database should be aware that, to keep annotation consistent and reliable, the TRC consortium decided early on to use NCBI's RefSeq collection of transcripts as the definitive source for the primary target sequence in the design of shRNA molecules.
        • As a general rule in the construction of the library, we construct shRNA molecules targeting just the first RefSeq transcript reported for each NCBI gene. Partly as a result of our design process (described below), the majority of the shRNAs target all known transcript variants.

        A brief narrative of the candidate selection process

        • Get the Candidate Sequences
          For each human and mouse RefSeq transcript, we generate all 21mers, from those starting 25 bp after the beginning of the CDS up to those starting 150 bp from the end of the transcript. Each 21mer is called a "candidate".
        • Score the Candidate Sequences For Knockdown Efficiency
          Each candidate is given an "original score" by applying a set of rules that either penalize or reward features predicting successful knockdown and clone-design considerations, and then calculating the product of all the penalties/rewards. The individual rules are listed below. The candidates are then sorted by score and all those above a minimum score are stored.
        • Score the Candidate Sequences for Specificity
          We must balance the prediction of knockdown efficiency against the desire to minimize interaction with off-target genes, without a clear understanding of how to predict off-target "hits". We calculate a "specificity score" to promote candidates without obvious off-target transcripts. Each candidate is compared by BLASTN to two distinct abstractions of the transcriptome: the NCBI UniGene "unique" database (vaguely defined by NCBI as the "longest, best" sequence from each UniGene cluster), and the transcripts from RefSeq. We deem a "miss" any sequence pair with at least three differences, at least two of which fall in the core positions 3-19, i.e., not on the ends of the 21mer target region. We then determine whether each candidate hits one UniGene cluster, one LocusLink transcript, one LocusLink gene, and, for genes with multiple transcripts, all the transcripts in the gene. Using just the "hits-One-Unigene" and "hits-One-NM" values, we assign a "specificity score" to each candidate: candidates that uniquely hit one UniGene cluster AND one LocusLink transcript are rewarded, those that uniquely hit one UniGene cluster OR one LocusLink transcript are rewarded, but less so, and those with neither UniGene nor LocusLink specificity are penalized. After determining and storing this "specificityScore", we re-sort the candidates.
        • Spacing the candidate 21mers along the transcript
          Since we synthesize 5 oligo pairs for each transcript, and since we hypothesize that the secondary structure of the target transcript plays a role in the effectiveness of an shRNA, we want the candidates spread out along the transcript, with one from the 3' UTR and four along the CDS. To pick the five candidates, the highest-scoring 3' UTR candidate, if available, is chosen first. Next, the top-scoring candidate among the CDS candidates is chosen. A position penalty is then applied to all the other CDS candidates; the penalty is more severe the closer a candidate is to the first CDS candidate picked. After the position penalty is applied, all the CDS candidates are re-sorted by their newly calculated, position-weighted score. From the list of remaining CDS candidates, the highest-scoring candidate is chosen, and the position penalty is applied again to all the remaining candidates based on the CDS candidates already picked. This process is repeated until all the candidates are rescored. Finally, the top 5 position- and specificity-weighted candidates are chosen for oligo synthesis.
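
The three mechanical parts of the narrative above (window generation, the "miss" rule, and the spaced selection) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the TRC implementation. All names (`candidate_windows`, `is_miss`, `Candidate`, `position_penalty`, `pick_spaced`) are hypothetical. The text does not give the form of the position penalty, so the linear ramp with a 200 bp scale is invented purely for illustration. The sketch also assumes 0-based coordinates, inclusive window bounds, and an ungapped, equal-length comparison for the "miss" rule.

```python
from dataclasses import dataclass
from typing import List, Tuple


def candidate_windows(transcript: str, cds_start: int, k: int = 21) -> List[Tuple[int, str]]:
    """Return (start, kmer) pairs for every candidate window (0-based starts).

    Assumption: starts run from 25 bp after the CDS start through
    150 bp before the transcript end, both bounds inclusive.
    """
    first = cds_start + 25
    last = len(transcript) - 150
    return [(i, transcript[i:i + k]) for i in range(first, last + 1)]


def is_miss(candidate: str, other: str) -> bool:
    """A pair is a 'miss' if it has >= 3 differences, >= 2 of them in core positions 3-19.

    Assumption: both sequences are aligned without gaps and have equal length.
    """
    if len(candidate) != len(other):
        raise ValueError("sequences must have equal length")
    # Positions are 1-based, as in the text.
    diffs = [pos for pos, (a, b) in enumerate(zip(candidate, other), start=1) if a != b]
    core = [pos for pos in diffs if 3 <= pos <= 19]
    return len(diffs) >= 3 and len(core) >= 2


@dataclass(frozen=True)
class Candidate:
    start: int      # 0-based start of the 21mer on the transcript
    score: float    # score after knockdown and specificity weighting
    in_utr3: bool   # True if the 21mer lies in the 3' UTR


def position_penalty(distance: int, scale: int = 200) -> float:
    """Hypothetical penalty factor in [0, 1]; smaller (harsher) for closer candidates."""
    return min(1.0, distance / scale)


def pick_spaced(cands: List[Candidate], total: int = 5) -> List[Candidate]:
    """Pick the best 3' UTR candidate, then greedily pick position-weighted CDS candidates."""
    picked: List[Candidate] = []
    utr = [c for c in cands if c.in_utr3]
    if utr:
        picked.append(max(utr, key=lambda c: c.score))

    picked_cds: List[Candidate] = []
    remaining = [c for c in cands if not c.in_utr3]

    def weighted(c: Candidate) -> float:
        # Penalize the candidate once for each CDS candidate already picked.
        w = c.score
        for p in picked_cds:
            w *= position_penalty(abs(c.start - p.start))
        return w

    while remaining and len(picked) < total:
        best = max(remaining, key=weighted)
        picked.append(best)
        picked_cds.append(best)
        remaining.remove(best)
    return picked
```

With these assumed penalties, a strong candidate that sits 10 bp from an already-picked one loses to a weaker candidate several hundred bp away, which is the spreading behavior the narrative describes. Recomputing the weighted score on each pass has the same effect as the re-sort step in the text.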

         
