Title of invention: METHODS AND SYSTEMS FOR ALIGNING SEQUENCES
Abstract: The invention includes methods for aligning reads (e.g., nucleic acid reads, amino acid reads) to a reference sequence construct, methods for building the reference sequence construct, and systems that use the alignment methods and constructs to produce sequences. The method is scalable and can be used to align millions of reads to a construct thousands of bases or amino acids long. The invention additionally includes methods for identifying a disease or a genotype based upon alignment of nucleic acid reads to a location in the construct.
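Publication number: US2015057946(A1); Publication date: 2015.02.26
Application number: US201314016833; Filing date: 2013.09.03
Applicant: SEVEN BRIDGES GENOMICS INC.; Inventor: Kural Deniz
Classification: G06F19/22; Main classification: G06F19/22
Agency: (not listed); Agent: (not listed)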
Main claim: 1. A system for assembling a plurality of sequence reads comprising a processor and non-transitory memory, wherein the memory comprises instructions that, when executed, cause the processor to: obtain a plurality of sequence reads as strings of symbols, wherein each of the plurality of sequence reads is from the same subject; compare each string of symbols corresponding to a sequence read to a plurality of positions in a reference sequence construct, wherein the construct comprises at least two different strings of symbols at multiple positions in the construct; score overlaps between each string of symbols corresponding to a sequence read and each of the plurality of positions in the reference sequence construct, wherein a higher score corresponds to a greater amount of overlap; identify the overlap corresponding to the highest score for each sequence read; assign each sequence read to a location on the construct corresponding to the highest score; assemble the plurality of sequence reads into an assembled sequence based upon the assigned location of each sequence read; and write a file to memory corresponding to the assembled sequence for the subject.
Address: Cambridge, MA, US
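
Illustrative sketch (not part of the patent record): the compare, score, assign, and assemble steps recited in the main claim can be pictured with a small Python example. Everything in it is an assumption made for illustration: the toy `CONSTRUCT`, the function names, and the scoring rule. It uses a plain ungapped match count and a per-position majority vote. The record does not specify the scoring function, so this should not be read as the applicant's algorithm.

```python
from collections import Counter

# Hypothetical toy construct: each position lists the symbols allowed there.
# Positions with two or more symbols model known variation in the reference.
CONSTRUCT = ["A", "C", "GT", "T", "A", "CG", "G", "A"]


def score_overlap(read, construct, start):
    """Count read symbols that match an allowed symbol at each position."""
    return sum(
        1
        for offset, symbol in enumerate(read)
        if symbol in construct[start + offset]
    )


def best_location(read, construct):
    """Return (start, score) of the highest-scoring ungapped placement."""
    starts = range(len(construct) - len(read) + 1)
    best = max(starts, key=lambda s: score_overlap(read, construct, s))
    return best, score_overlap(read, construct, best)


def assemble(reads, construct):
    """Place each read at its best location, then take a per-position majority."""
    votes = [Counter() for _ in construct]
    for read in reads:
        start, _ = best_location(read, construct)
        for offset, symbol in enumerate(read):
            votes[start + offset][symbol] += 1
    # Positions that no read covers are reported as "N".
    return "".join(v.most_common(1)[0][0] if v else "N" for v in votes)


reads = ["ACTT", "TTAG", "AGGA"]  # assumed to come from the same subject
print(assemble(reads, CONSTRUCT))  # prints ACTTAGGA
```

Here the read `TTAG` scores 4 at position 2 because the construct allows either `G` or `T` there, which a single linear reference would not. A realistic aligner would also handle gaps, mismatch penalties, and ties, all of which this sketch omits.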