Title of invention: Accurate and fast mapping of reads to genome
Abstract: Accurate and fast mapping of sequencing reads obtained from a targeted sequencing procedure can be provided. Once a target region is selected, alternate regions of the genome that are sufficiently similar to the target region can be identified. If a sequencing read is more similar to the target region than to an alternate region, then the read can be determined as aligning to the target region. The reads aligning to the target region can then be analyzed to determine whether a mutation exists in the target region. Accordingly, a sequencing read can be compared to the target region and the corresponding alternate regions, and not to the entire genome, thereby providing computational efficiency.
Publication number: US9218450(B2)  Publication date: 2015.12.22
Application number: US201213689314  Filing date: 2012.11.29
Applicant: Roche Molecular Systems, Inc.  Inventors: Chen Xiaoying; Li Yan; Liu Wei-Min; Ma Xiaoju (Max); Truong Sim-Jasmine
Classification: G01N33/48; G06F19/22  Main classification: G01N33/48
Agency: Kilpatrick Townsend & Stockton LLP  Agent: Kilpatrick Townsend & Stockton LLP
Main claim: 1. A method of detecting variants in a target region in a sample genome of an organism, the method comprising: receiving a plurality of sequence reads, the sequence reads obtained from sequencing genomic segments in a sample obtained from the organism, wherein the sequencing includes targeting genomic segments from the target region; identifying one or more alternate regions in a reference genome that have a respective first number of variations from the target region in the reference genome, each respective first number being greater than one and less than a first threshold number; performing, with a computer system, an alignment of the plurality of sequence reads to the target region in the reference genome to identify a set of sequence reads that align to the target region in the reference genome with less than a second threshold number of variations; removing from the set a sequence read that is aligned to one of the one or more alternate regions in the reference genome with a second number of variations that is less than a third threshold number; and analyzing the remaining sequence reads of the set to determine variants in the target region in the sample genome.
Address: Pleasanton CA US
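The filtering steps recited in the main claim can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes, purely for illustration, that "variations" are counted as mismatches in an ungapped alignment, that candidate alternate regions have the same length as the target region, and that the three thresholds are supplied by the caller. The function names (`count_mismatches`, `min_mismatches`, `find_alternate_regions`, `filter_reads`) are hypothetical and do not come from the patent.

```python
from typing import List, Optional


def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def min_mismatches(read: str, region: str) -> Optional[int]:
    """Fewest mismatches over all ungapped placements of `read` in `region`.

    Returns None when the read is empty or longer than the region.
    Assumption: mismatches stand in for the claim's "variations".
    """
    n = len(read)
    if n == 0 or n > len(region):
        return None
    return min(
        count_mismatches(read, region[start:start + n])
        for start in range(len(region) - n + 1)
    )


def find_alternate_regions(
    target: str, candidates: List[str], first_threshold: int
) -> List[str]:
    """Keep candidates whose variation count from the target is
    greater than one and less than `first_threshold`."""
    alternates = []
    for cand in candidates:
        if len(cand) != len(target):
            continue  # assumption: only equal-length candidates are compared
        variations = count_mismatches(target, cand)
        if 1 < variations < first_threshold:
            alternates.append(cand)
    return alternates


def filter_reads(
    reads: List[str],
    target: str,
    alternates: List[str],
    second_threshold: int,
    third_threshold: int,
) -> List[str]:
    """Keep reads that align to the target with fewer than
    `second_threshold` mismatches, then drop any read that also aligns
    to an alternate region with fewer than `third_threshold` mismatches."""
    kept = []
    for read in reads:
        to_target = min_mismatches(read, target)
        if to_target is None or to_target >= second_threshold:
            continue  # does not align to the target region
        hits_alternate = False
        for alt in alternates:
            to_alt = min_mismatches(read, alt)
            if to_alt is not None and to_alt < third_threshold:
                hits_alternate = True
                break
        if not hits_alternate:
            kept.append(read)
    return kept
```

Under these assumptions, reads are compared only against the target region and its alternate regions, never against the whole reference genome, which is where the abstract locates the computational saving. The reads returned by `filter_reads` would then go on to variant analysis, which the sketch leaves out.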