Title of invention NONLINEAR SYSTEM IDENTIFICATION FOR CLASS PREDICTION IN BIOINFORMATICS AND RELATED APPLICATIONS
Abstract Many of the current procedures for detecting coding regions in human DNA sequences rely on a combination of individual techniques, such as discriminant analysis and neural net methods. A recent paper introduced the notion of using techniques from nonlinear systems identification as one means of classifying protein sequences into their structure/function groups. The particular technique employed, called parallel cascade identification, achieved sufficiently high correct classification rates to suggest it could be usefully combined with current protein classification schemes to enhance overall accuracy. In the present paper, parallel cascade identification is used in a pilot study to distinguish coding (exon) from noncoding (intron) human DNA sequences. Only the first exon and first intron sequences with known boundaries in genomic DNA from the β T-cell receptor locus were used for training. The resulting parallel cascade classifiers achieved classification rates of about 89% on novel sequences in a test set, and averaged about 82% when results of a blind test were included. These results indicate that parallel cascade classifiers may be useful components in future coding region detection programs. Key Words: Nonlinear Systems, Identification, Exons, Introns, DNA Sequences.
Publication number CA2325225(A1) Publication date 2002.05.03
Application number CA20002325225 Filing date 2000.11.20
Applicant KORENBERG, MICHAEL J. Inventor KORENBERG, MICHAEL J.
Classification G06F19/00; (IPC1-7): G06F17/60 Main classification G06F19/00
Agency Agent
Principal claim
Address