A question about the algorithms used for PMF database searching

丁香园 (DXY) Forum

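Is anyone here studying the algorithms used in proteomics to identify proteins by searching a database with a peptide mass fingerprint (PMF)?
I have recently been looking into one of them, the probability-based algorithm. Its representative software is Mascot, which grew out of MOWSE. I have read a lot of the literature, but very little of it goes into detail. The following is about all I could find: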

        “MOWSE Scoring scheme

        The final scoring scheme is based on the frequency of a
        fragment molecular weight being found in a protein of a given
        range of molecular weight. OWL database sequence entries were
        initially grouped into 10 kDalton intact molecular weight
        intervals. For each 10 kDalton protein interval, peptide fragment
        molecular weights were assigned to cells of 100 Dalton intervals.
        The cells therefore contained the number of times a particular
        fragment molecular weight occurred in a protein of any given size.
        This operation was performed for each enzyme. Cell frequency
        values were calculated by dividing each cell value by the total
        number of peptides in each 10 kD protein interval. Cell frequency
        values for each 10 kDalton interval were then normalised to the
        largest cell value (Fmax), with all the cell values recalculated
        as:

        Cell value = Old value / Fmax

        to yield floating point numbers between 0 and 1. These
        distribution frequency values, calculated for each cleavage
        reagent, were then built into the MOWSE search program. For
        every database entry scanned, all matching fragments contribute to
        the final score. In the current implementation, non-matching
        fragments are ignored (neutral). For each matching peptide Mw a
        score is assigned by looking up the appropriate normalised
        distribution frequency value. In the case of multiple 'hits' in
        any one target protein (i.e. more than one matching peptide Mw),
        the distribution frequency scores are multiplied. The final
        product score is inverted and then normalised to an 'average'
        protein Mw of 50 kDaltons to reduce the influence of random score
        accumulation in large proteins (>200 kDaltons). The final score is
        thus calculated as:

        Score = 50/(Pn x H)

        Where Pn is the product of n distribution scores and H the 'hit'
        protein molecular weight in kD.

        Important consequences of this type of scoring scheme
        are that matches with peptides of higher Mw carry more scoring
        weight, and that the non-random distribution of fragment molecular
        weights in proteins of different sizes is compensated for.”


In other words, the description contains only two formulas.
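To make the discussion concrete, here is a minimal Python sketch of how I read the quoted description: build the normalised frequency table for one enzyme, then compute Score = 50/(Pn x H). This is only my interpretation of the passage, not the actual MOWSE or Mascot code. The function names, the input layout (a list of protein Mw plus digested peptide Mw values), the toy numbers, and the choice to return 0 when nothing matches are all my own assumptions.

```python
from collections import defaultdict
from math import prod

PROTEIN_BIN_DA = 10_000  # 10 kDa intact-protein intervals (from the quoted text)
PEPTIDE_BIN_DA = 100     # 100 Da peptide cells (from the quoted text)


def build_frequency_table(database):
    """Build the normalised cell values for one enzyme.

    `database` is a hypothetical input format: an iterable of
    (protein_mw_da, [peptide_mw_da, ...]) pairs, already digested in silico.
    Returns {protein_bin: {peptide_cell: value in (0, 1]}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for protein_mw, peptide_mws in database:
        protein_bin = int(protein_mw // PROTEIN_BIN_DA)
        for peptide_mw in peptide_mws:
            counts[protein_bin][int(peptide_mw // PEPTIDE_BIN_DA)] += 1

    table = {}
    for protein_bin, row in counts.items():
        total = sum(row.values())  # all peptides in this 10 kDa interval
        freqs = {cell: n / total for cell, n in row.items()}
        fmax = max(freqs.values())
        # Cell value = Old value / Fmax
        table[protein_bin] = {cell: f / fmax for cell, f in freqs.items()}
    return table


def mowse_score(table, protein_mw, matched_peptide_mws):
    """Score = 50 / (Pn x H), with H the protein Mw in kDa.

    Non-matching peptides are simply not passed in (they are neutral).
    Returning 0.0 for "no matches" is an assumption of this sketch.
    A matched peptide whose cell is missing from the table raises KeyError.
    """
    if not matched_peptide_mws:
        return 0.0
    row = table[int(protein_mw // PROTEIN_BIN_DA)]
    pn = prod(row[int(mw // PEPTIDE_BIN_DA)] for mw in matched_peptide_mws)
    return 50.0 / (pn * (protein_mw / 1000.0))


# Toy database: two proteins in the 20-30 kDa interval (made-up masses).
toy_database = [
    (25_000, [850.4, 1234.6, 1250.1, 2310.2]),
    (27_500, [812.3, 1201.7, 890.0]),
]
toy_table = build_frequency_table(toy_database)
print(mowse_score(toy_table, 25_000, [850.4]))
print(mowse_score(toy_table, 25_000, [850.4, 2310.2]))
```

With these made-up numbers, matching only the 850.4 Da peptide gives a score of 2.0, while also matching the rarer 2310.2 Da peptide raises it to about 6.0. That seems to be what the passage means when it says matches with higher-Mw peptides carry more weight: their cells are less populated, so they shrink Pn and increase the score.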
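If any of the experts here have studied this, I would appreciate your guidance, especially any more detailed material on the Mascot algorithm and how it is implemented. Anyone else who is interested is welcome to reply and join the discussion.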

Material or opinions on the algorithm behind ProFound, another PMF search program, would also be welcome.