Title of Invention |
Computer Files and Methods Supporting Forensic Analysis of Nucleotide Sequence Data |
Abstract |
In one illustrative embodiment, a method may comprise receiving a first text-based computer file including one or more records, each of the one or more records comprising nucleotide sequence data generated by a read of a massively parallel sequencing (MPS) instrument, determining whether a portion of the nucleotide sequence data of each record represents a short tandem repeat (STR) associated with a locus, placing each portion of the nucleotide sequence data determined to represent an STR associated with a locus into one of a number of locus-specific lists, determining a number of occurrences within each locus-specific list of identical nucleotide sequence data representing a unique STR, and generating a second text-based computer file including one or more records, each of the one or more records corresponding to a unique STR for which the number of occurrences of identical nucleotide sequence data representing the unique STR exceeded an abundance threshold. |
Application Publication Number |
US2014278127(A1) |
Application Publication Date |
2014.09.18 |
Application Number |
US201313834830 |
Filing Date |
2013.03.15 |
Applicant |
BATTELLE MEMORIAL INSTITUTE |
Inventors |
Young Brian A.; Minard-Smith Angela T.; Heizer, JR. Esley M. |
Classification |
G06F19/22 |
Main Classification |
G06F19/22 |
Agency |
|
Agent |
|
Principal Claim |
1. A method comprising:
receiving a first text-based computer file including one or more records, each of the one or more records comprising nucleotide sequence data generated by a read of a massively parallel sequencing (MPS) instrument;
determining, for each of the one or more records of the first text-based file, whether a portion of the nucleotide sequence data of the record represents a short tandem repeat (STR) associated with a locus;
placing each portion of the nucleotide sequence data determined to represent an STR associated with a locus into one of a number of locus-specific lists;
determining, for each of the locus-specific lists, a number of occurrences within the locus-specific list of identical nucleotide sequence data representing a unique STR; and
generating a second text-based computer file including one or more records, each of the one or more records corresponding to a unique STR for which the number of occurrences of identical nucleotide sequence data representing the unique STR exceeded an abundance threshold. |
Address |
Columbus OH US |
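
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the input is treated as a FASTQ-style file with four lines per record, a read is assigned to a locus when it contains a pair of flanking sequences, and the names `LOCI`, `read_sequences`, `extract_str`, and `call_strs`, the flank sequences, and the default threshold are all hypothetical. The sketch does not check that the extracted region is actually a tandem repeat.

```python
from collections import Counter, defaultdict

# Hypothetical locus definitions (left flank, right flank), invented for this sketch.
LOCI = {
    "LOCUS_A": ("ACGTAC", "TTGCAA"),
    "LOCUS_B": ("GGATCC", "CCTAGG"),
}


def read_sequences(lines):
    """Yield the sequence line of each 4-line FASTQ-style record (assumed format)."""
    lines = [ln.strip() for ln in lines if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i + 1]


def extract_str(seq, left, right):
    """Return the region between the two flanks, or None if it cannot be found."""
    start = seq.find(left)
    if start == -1:
        return None
    start += len(left)
    end = seq.find(right, start)
    if end == -1 or end == start:
        return None
    return seq[start:end]


def call_strs(lines, loci=LOCI, abundance_threshold=2):
    """Return tab-separated output records: locus, sequence, occurrence count."""
    # Place each matching region into a locus-specific list.
    locus_lists = defaultdict(list)
    for seq in read_sequences(lines):
        for locus, (left, right) in loci.items():
            region = extract_str(seq, left, right)
            if region is not None:
                locus_lists[locus].append(region)
                break  # assign each read to at most one locus

    # Count identical sequences per locus and keep those exceeding the threshold.
    records = []
    for locus in sorted(locus_lists):
        counts = Counter(locus_lists[locus])
        for region, n in sorted(counts.items()):
            if n > abundance_threshold:
                records.append(f"{locus}\t{region}\t{n}")
    return records
```

The returned records could be written to the second text-based file with, for example, `"\n".join(records)`. The strict comparison `n > abundance_threshold` mirrors the claim wording "exceeded an abundance threshold".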