Title: Analyzing genome sequencing information to determine likelihood of co-segregating alleles on haplotypes
Abstract: Sequencing information is used to correlate alleles at certain locations to alleles at other locations. The statistical information from the reads of fragments in a sample can be used to determine the phasing of haplotypes and to correct or confirm base calls at the locations. In one example, a confidence value (strength score) is determined for a particular hypothesis, which can include whether two alleles are on the same haplotype at two particular loci, as well as what the alleles are on another haplotype (e.g., for a diploid organism). The strength can include a positive contribution from data that is consistent with the hypothesis and a negative contribution from data that is inconsistent with the hypothesis, where both values can be used in a formula to determine the strength.
Publication number: US8880456(B2) Publication date: 2014.11.04
Application number: US201213591741 Filing date: 2012.08.22
Applicant: Complete Genomics, Inc. Inventors: Kermani Bahram Ghaffarzadeh; Drmanac Radoje
Classification: G06F9/44; G06N7/02; G06N7/08; G06F19/18; G06F19/22 Main classification: G06F9/44
Agency: Kilpatrick Townsend & Stockton LLP Agents: Kilpatrick Townsend & Stockton LLP; Raczkowski David B.
Principal claim: 1. A method of determining at least part of a genome of an organism from one or more samples, the one or more samples including nucleic acid molecules of the organism, the method comprising: receiving sequencing information of a plurality of the nucleic acid molecules in the one or more samples; identifying a plurality of loci of a first chromosome; computing, with a computer system, a first strength conveying a likelihood that a first allele of a first locus and a second allele of a second locus are on a first 2-allele haplotype of the organism, wherein computing the first strength includes: determining a first positive contribution to the likelihood based on sequencing information consistent with the first allele and the second allele being on the first 2-allele haplotype; determining a first negative contribution to the likelihood based on sequencing information inconsistent with the first allele and the second allele being on the first 2-allele haplotype; and using the first positive and first negative contributions to compute the first strength; and calculating two haplotypes involving the first locus and the second locus using the first strength.
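The steps recited in claim 1 can be illustrated with a minimal Python sketch. The record does not give the formula that combines the positive and negative contributions, so everything below is an assumption made for illustration: the contributions are simple counts of fragments that observe both loci, the strength is the normalized difference (positive - negative) / (positive + negative), and the names `pair_strength` and `phase_two_loci` are hypothetical. It should not be read as the patented method.

```python
from collections import Counter


def pair_strength(observations, consistent_pairs, inconsistent_pairs):
    """Return (positive, negative, strength) for one phasing hypothesis.

    observations: list of (allele_at_locus1, allele_at_locus2) tuples, one per
    fragment that covers both loci.
    ASSUMPTION: contributions are raw counts and the strength is a normalized
    difference; the source only says both values enter "a formula".
    """
    counts = Counter(observations)
    positive = sum(counts[p] for p in consistent_pairs)
    negative = sum(counts[p] for p in inconsistent_pairs)
    total = positive + negative
    strength = (positive - negative) / total if total else 0.0
    return positive, negative, strength


def phase_two_loci(observations, alleles_locus1, alleles_locus2):
    """Choose the two 2-allele haplotypes of a diploid sample at two loci."""
    a1, b1 = alleles_locus1
    a2, b2 = alleles_locus2
    # Hypothesis: a1 and a2 share one haplotype, so b1 and b2 share the other.
    consistent = [(a1, a2), (b1, b2)]
    inconsistent = [(a1, b2), (b1, a2)]
    _, _, strength = pair_strength(observations, consistent, inconsistent)
    if strength > 0:
        haplotypes = consistent
    elif strength < 0:
        haplotypes = inconsistent
    else:
        haplotypes = None  # the data do not favor either phasing
    return haplotypes, strength


# Made-up example data: 11 fragments support the hypothesis, 1 contradicts it.
observations = [("A", "C")] * 6 + [("G", "T")] * 5 + [("A", "T")]
haplotypes, strength = phase_two_loci(observations, ("A", "G"), ("C", "T"))
print(haplotypes, round(strength, 3))
```

Under this assumed formula the strength ranges from -1 to 1, so its sign selects the phasing and its magnitude gives a rough confidence. A real implementation would likely also weight each observation by base-call quality.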
Address: Mountain View, CA, US