Title of invention: Identification of DNA fragments and structural variations
Abstract: Various short reads can be grouped and identified as coming from the same long DNA fragment (e.g., by using wells with a relatively low concentration of DNA). A histogram of the genomic coverage of a group of short reads can provide the edges of the corresponding long fragment (a "pulse"). Knowledge of these pulses makes it possible to determine the haploid genome and to identify structural variations.
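As an illustration of how fragment edges might be read off a coverage histogram, the following is a minimal sketch and not the method disclosed in the patent. It assumes the coverage for one aliquot is already available as a list of per-position read depths, and that short uncovered gaps inside a long fragment should be bridged. The function name `find_pulses` and the parameter `max_gap` are hypothetical.

```python
def find_pulses(coverage, max_gap=0):
    """Return half-open (start, end) intervals of covered positions.

    coverage: list of read depths, one per genomic position (assumed input).
    max_gap:  hypothetical tolerance; runs of uncovered positions no longer
              than this are treated as lying inside the same pulse.
    """
    pulses = []
    start = None         # start of the pulse currently being built
    last_covered = None  # most recent position with depth > 0
    for pos, depth in enumerate(coverage):
        if depth > 0:
            if start is None:
                start = pos
            elif pos - last_covered - 1 > max_gap:
                # Gap too long: close the current pulse, open a new one.
                pulses.append((start, last_covered + 1))
                start = pos
            last_covered = pos
    if start is not None:
        pulses.append((start, last_covered + 1))
    return pulses


coverage = [0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 2, 1, 0]
print(find_pulses(coverage, max_gap=2))
```

A larger `max_gap` merges sparse reads into fewer, longer pulses. A suitable value would depend on read density, which the abstract does not specify.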
Publication number: US9514272(B2) Publication date: 2016.12.06
Application number: US201213649966 Filing date: 2012.10.11
Applicant: Complete Genomics, Inc. Inventors: Kermani Bahram Ghaffarzadeh; Drmanac Radoje; Alferov Oleg
Classification: G01N33/50; G06F19/22 Main classification: G01N33/50
Agency: Kilpatrick Townsend & Stockton LLP Agent: Kilpatrick Townsend & Stockton LLP; Raczkowski David B.
Principal claim: 1. A method of detecting a structural variation in a genome of a sample from an organism, the method comprising:
  providing a plurality of aliquots, each aliquot comprising nucleic acid molecules of the genome that have barcodes to track from which aliquot a nucleic acid molecule originates, wherein each of the plurality of aliquots includes less than a genomic equivalent of the genome of the organism;
  sequencing a plurality of nucleic acid molecules in the plurality of aliquots to obtain sequences of the plurality of nucleic acid molecules and the barcodes;
  receiving, at a computer system, sequence data from the sequencing, the sequence data including sequences of at least one portion of each nucleic acid molecule of the plurality of nucleic acid molecules, wherein the sequence data includes the barcodes for tracking from which aliquot a sequence originates;
  for each of the plurality of nucleic acid molecules: aligning, with the computer system, at least one sequence of the nucleic acid molecule to a reference genome;
  for each of the plurality of aliquots, calculating, with the computer system, a histogram for a first chromosomal region by:
    identifying a respective group of sequences as being derived from a same fragment of DNA based on the barcodes of the respective group of sequences corresponding to a same aliquot, the same fragment including at least a portion of the first chromosomal region;
    for each genomic position of a plurality of genomic positions within the first chromosomal region: aggregating a number of instances that an aligned sequence of the respective group includes the genomic position;
  comparing the histograms to identify a common increase or decrease in the histograms within a same window of the first chromosomal region as a location of the structural variation in the first chromosomal region of the genome of the organism.
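The computational steps of the claim (per-aliquot histograms, then a comparison across aliquots) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation. It assumes reads are already aligned and supplied as `(barcode, start, end)` half-open intervals. A "common" change is taken to mean that every aliquot covering the region shows the same direction of change relative to its own mean coverage. The function names and the `low` and `high` ratio thresholds are hypothetical.

```python
from collections import defaultdict


def coverage_histograms(reads, region_start, region_end):
    """Build one coverage histogram per barcode (aliquot) for a region.

    reads: iterable of (barcode, start, end) aligned half-open intervals
           (assumed input format).
    Returns {barcode: [depth at each position in the region]}.
    """
    length = region_end - region_start
    hists = defaultdict(lambda: [0] * length)
    for barcode, start, end in reads:
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi):  # empty if the read misses the region
            hists[barcode][pos - region_start] += 1
    return dict(hists)


def common_changes(hists, window, low=0.5, high=1.5):
    """Flag windows where all aliquots change coverage in the same direction.

    Each aliquot's window mean is divided by that aliquot's mean over the
    whole region. low/high are hypothetical ratio thresholds.
    Returns a list of (window offset within the region, "decrease"|"increase").
    """
    if not hists:
        return []
    length = len(next(iter(hists.values())))
    # Per-aliquot baseline; aliquots with no coverage are ignored.
    baselines = {b: sum(h) / len(h) for b, h in hists.items() if sum(h) > 0}
    calls = []
    for w in range(0, length - window + 1, window):
        ratios = [
            (sum(hists[b][w:w + window]) / window) / base
            for b, base in baselines.items()
        ]
        if ratios and all(r <= low for r in ratios):
            calls.append((w, "decrease"))
        elif ratios and all(r >= high for r in ratios):
            calls.append((w, "increase"))
    return calls


reads = [("A", 0, 20), ("A", 30, 40), ("B", 5, 20), ("B", 30, 40)]
hists = coverage_histograms(reads, region_start=0, region_end=40)
print(common_changes(hists, window=10))
```

Normalizing each aliquot against its own baseline keeps aliquots with different read depths comparable. Requiring agreement across all aliquots is a deliberately strict choice for the sketch, and a tolerance for dissenting aliquots could be added.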
Address: Mountain View, CA, US