Title of invention Methods for identifying nucleic acid polymorphisms
Abstract The invention provides an automated method of identifying a plurality of different polymorphisms within two or more related nucleic acid sequences. The method consists of: (a) obtaining a data set comprising a nucleic acid sequence assembly and a plurality of sequence characteristic parameters associated with said assembly; (b) indexing said nucleic acid assembly and said plurality of sequence characteristic parameters in a database; (c) selecting a region of said nucleic acid assembly having sequence characteristic parameters indicative of a polymorphic sequence, and (d) displaying two or more nucleic acid sequences of said region, said two or more sequences identifying different polymorphisms within said nucleic acid assembly. Also provided is a method of identifying a nucleic acid containing an indel region within a set of related nucleic acid sequences. The method consists of: (a) identifying a nucleic acid within two or more related nucleic acid sequences suspected of containing an indel region, said nucleic acid containing one or more regions having a plurality of polymorphisms, and (b) determining the occurrence of two or more criteria indicating the presence of an indel region associated with said one or more regions having a plurality of polymorphisms, said occurrence characterizing said nucleic acid as containing an indel region. Further provided is a method of determining the sequence of an allele containing an indel region within a set of related nucleic acid sequences. The method consists of: (a) identifying a nucleic acid containing an indel region within two or more related nucleic acid sequences; (b) generating a consensus sequence within said indel region for said two or more related nucleic acid sequences; (c) identifying a matching string to said consensus sequence within at least one of said two or more related nucleic acid sequences, and (d) subtracting said consensus sequence from said two or more related nucleic acid sequences, the presence or absence of a unique sequence in one of said related nucleic acid sequences indicating the presence of an actual indel region. The invention additionally provides an automated system for identifying a plurality of different polymorphisms within two or more related nucleic acid sequences. The system consists of: (a) a sample submission module capable of transmitting data; (b) a core statistics loading and post processing module containing sequence characteristic parameters; (c) an assembly module capable of constructing sequence assemblies from sequence database extracted data; (d) a SNP prospector module capable of identifying polymorphisms; (e) a polymorphism loader submodule capable of parsing polymorphic region sequence and sequence characteristic parameters from sequence assemblies; (f) a SNP database structured to contain the information produced in steps (a) through (e), and (g) an output module for display or further manipulation of specified data in step (f).
Application publication number US2003211504(A1) Publication date 2003.11.13
Application number US20020268058 Filing date 2002.10.09
Applicant FECHTEL KIM;PRABHAKAR SHASHI;HUANG HUI;FITZGERALD MICHAEL G.;PRESCOTT-ROY JOANN;RUNGE MICHELLE;WANG HUAJUN;GIBSON RENE LEE Inventor FECHTEL KIM;PRABHAKAR SHASHI;HUANG HUI;FITZGERALD MICHAEL G.;PRESCOTT-ROY JOANN;RUNGE MICHELLE;WANG HUAJUN;GIBSON RENE LEE
Classification C12Q1/68;G06F19/00;(IPC1-7):C12Q1/68;G01N33/48;G01N33/50 Main classification C12Q1/68
Agency Agent
Principal claim
Address
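
Illustrative note: step (c) of the first method in the abstract, selecting a region whose sequence characteristic parameters indicate a polymorphic sequence, might be sketched as below. This is a minimal sketch and not the patented method. The abstract does not state which parameters or thresholds are used, so the function name `polymorphic_columns`, the `min_minor_count` threshold, the use of `-` for alignment gaps, and the sample reads are all assumptions made for illustration.

```python
from collections import Counter


def polymorphic_columns(aligned_reads, min_minor_count=2):
    """Return (column, allele counts) for columns that look polymorphic.

    Assumption (not from the source): a column counts as polymorphic when
    at least two different symbols are each seen min_minor_count times.
    A '-' gap is treated as a symbol, so a hit on '-' hints at an indel.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    hits = []
    for col in range(length):
        # Count every symbol in this column, ignoring ambiguous 'N' calls.
        counts = Counter(read[col] for read in aligned_reads if read[col] != "N")
        # Keep only symbols with enough supporting reads.
        supported = {base: n for base, n in counts.items() if n >= min_minor_count}
        if len(supported) >= 2:
            hits.append((col, supported))
    return hits


# Hypothetical aligned reads from one assembly (made-up data).
reads = [
    "ACGTAC-GT",
    "ACGTAC-GT",
    "ACCTACAGT",
    "ACCTACAGT",
]
hits = polymorphic_columns(reads)
for col, alleles in hits:
    print(col, alleles)
```

Requiring a minimum number of supporting reads per allele is one simple way to separate a candidate polymorphism from an isolated sequencing error; a real pipeline would presumably also weigh base quality scores and other assembly statistics.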