Title of invention: MULTI-SAMPLE DIFFERENTIAL VARIATION DETECTION
Abstract: DNA assembly techniques for a DNA dataset composed of DNA sequence reads make use of anchor points identified using a reference DNA sequence. Because the anchor point technique depends on a high-accuracy dataset, related techniques that detect and correct erroneous reads using k-Mer and statistical techniques are also disclosed. Once a high-accuracy dataset has been prepared, a read overlap graph is generated that removes exact matches with respect to the reference DNA sequence, thereby leaving behind potential structural variants. Using anchor points representing close matches to the reference DNA dataset, the read overlap graph is traversed to detect potential structural variants. The structural variants are then validated. Use cases for anchor assembly and related techniques, including multi-sample differential variant detection, are also disclosed.
Publication number: US2016246921(A1) Publication date: 2016.08.25
Application number: US201514631791 Filing date: 2015.02.25
Applicant: Spiral Genetics, Inc. Inventors: Bruestle Jeremy Joseph; Drees Becky L.
Classification: G06F19/22; G06F19/16 Main classification: G06F19/22
Agency: Agent:
Principal claim: 1. A method to detect a variation existing in a target DNA dataset but not existing in a subtraction DNA dataset, the method comprising: receiving the target DNA dataset comprising a set of reads from a target; generating a set of k-Mers from the reads in the received target DNA dataset; receiving the subtraction DNA dataset comprising a set of reads; generating a set of k-Mers from the reads in the received subtraction DNA dataset; detecting at least one structural variant from the k-Mers generated from the target DNA dataset; generating a set of k-Mers from the at least one structural variant; for each k-Mer generated from the at least one structural variant, determining whether the k-Mer is in the set of k-Mers generated from the subtraction DNA dataset; and upon determining that at least one k-Mer generated from the at least one structural variant is not in the set of k-Mers generated from the subtraction DNA dataset structural variant, reporting the structural variant is not in the subtraction DNA dataset.
Address: Seattle WA US
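
The subtraction step recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the applicant's implementation: the function names (`kmers`, `kmer_set`, `variants_absent_from_subtraction`), the value of `k`, and the sample sequences are all hypothetical. Structural variant detection itself is not shown; candidate variant sequences are assumed to be supplied by an earlier step.

```python
from typing import Iterable, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_set(reads: Iterable[str], k: int) -> Set[str]:
    """Return the union of the k-mers of every read in a dataset."""
    result: Set[str] = set()
    for read in reads:
        result |= kmers(read, k)
    return result


def variants_absent_from_subtraction(
    variants: Iterable[str],           # hypothetical: candidate variant sequences
    subtraction_reads: Iterable[str],  # reads of the subtraction dataset
    k: int,
) -> List[str]:
    """Report each variant having at least one k-mer that is missing
    from the subtraction dataset's k-mer set.

    A variant shorter than k yields no k-mers and is never reported.
    """
    subtraction_kmers = kmer_set(subtraction_reads, k)
    return [
        v for v in variants
        if any(km not in subtraction_kmers for km in kmers(v, k))
    ]


# Hypothetical toy data, chosen only to exercise the function.
subtraction = ["ACGTACGT", "GTACGTTT"]
candidates = ["ACGTAC", "ACGGAC"]
reported = variants_absent_from_subtraction(candidates, subtraction, k=4)
print(reported)
```

In this toy run, every 4-mer of `ACGTAC` occurs in the subtraction reads, so it is not reported. `ACGGAC` contains 4-mers that the subtraction reads lack, so it is reported as absent from the subtraction dataset. A plain in-memory `set` is used for clarity; a dataset of realistic size would likely need a more compact k-mer index.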