Title of invention |
A METHOD AND SYSTEM FOR ANALYSING DATA SEQUENCES |
Abstract |
<p>A sequencing system and method of generating index keys for one or more data sequences based on masked values of reads from a sample data sequence and/or one or more template data sequences. Each index key value may be based upon a concatenated form of the extracted values, although other transformations may be employed. A number of different masks may be applied to the data sequence at a number of locations. At least some of the masks may include indels and/or substitutions. The masks may be manually or computer generated. The data sequence may be one or more reference templates and/or one or more sample sequences, such as DNA or RNA sequences. Sample data may be stored in one or more indexes by correlating masked values of reads with index key values and storing an identifier for each read in association with a corresponding index key value. Sample data sequences may be evaluated by comparing sample sequences and template sequences having the same index key value, determining scores for the reads based on the comparison, and associating the scores with the reads. Reads may be rejected based upon the comparison. A read may be rejected if there is more than one position at which it has a best score. A read may be rejected if its score falls below a threshold score level.</p> |
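The indexing and rejection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The mask notation (a string of `1` and `0` characters, where `1` keeps a position), the function names, and the match-count score are all assumptions made for illustration. The sketch indexes the template and looks reads up against it, which is only one of the arrangements the abstract allows. Masks containing indels are not modelled, and each read is assumed to have the same length as the masks.

```python
from collections import defaultdict


def masked_key(window, mask):
    """Concatenate the characters of `window` at positions where `mask` is '1'."""
    return "".join(ch for ch, m in zip(window, mask) if m == "1")


def build_index(template, masks):
    """Map (mask, key) to the template positions whose masked window yields that key."""
    index = defaultdict(list)
    for mask in masks:
        width = len(mask)
        for pos in range(len(template) - width + 1):
            key = masked_key(template[pos:pos + width], mask)
            index[(mask, key)].append(pos)
    return index


def score(read, template, pos):
    """Hypothetical score: number of matching characters at this placement."""
    return sum(a == b for a, b in zip(read, template[pos:pos + len(read)]))


def place_read(read, template, index, masks, min_score):
    """Return (position, score) for a read, or None if the read is rejected."""
    # Collect every template position that shares an index key with the read.
    candidates = set()
    for mask in masks:
        candidates.update(index.get((mask, masked_key(read, mask)), []))
    if not candidates:
        return None  # no template window shares a key with this read

    scored = sorted(((score(read, template, p), p) for p in candidates), reverse=True)
    best_score, best_pos = scored[0]
    if best_score < min_score:
        return None  # rejected: score falls below the threshold
    if len(scored) > 1 and scored[1][0] == best_score:
        return None  # rejected: more than one position has the best score
    return best_pos, best_score


masks = ["110011", "101101"]  # illustrative masks, not taken from the patent
template = "ACGTACGTTTGACCA"
index = build_index(template, masks)
# The read differs from the template window at position 4 by one substitution;
# the first mask skips the substituted base, so the keys still agree.
result = place_read("ACCTTT", template, index, masks, min_score=5)
```

Using several masks means that a read with a substitution under one mask's ignored positions can still share a key with its true template location, which is why the lookup takes the union of candidates across masks before scoring.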
Publication number |
WO2010056131(A1) |
Publication date |
2010.05.20 |
Application number |
WO2009NZ00245 |
Filing date |
2009.11.13 |
Applicant |
REAL TIME GENOMICS, INC.;CLEARY, JOHN, GERALD |
Inventor |
CLEARY, JOHN, GERALD |
Classification |
C12Q1/68;G06F19/22 |
Main classification |
C12Q1/68 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|