Title of invention: CLASSIFICATION OF NUCLEOTIDE SEQUENCES BY LATENT SEMANTIC ANALYSIS
Abstract: DNA sequences are analyzed using latent semantic analysis. A set of nucleotide sequences is received, the set having a first number of sequences. A set of basis vectors is determined, the set having a second number of basis vectors that is smaller than the first number. Each basis vector represents a specific combination of predetermined nucleotide segments. For each nucleotide sequence, an approximate representation of the sequence is determined based on a combination of the basis vectors. For each pair of nucleotide sequences, a distance between the pair is determined according to the distance between the approximate representations of the two sequences. The set of nucleotide sequences is classified based on the distances between the pairs of nucleotide sequences.
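The steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation, and every specific choice in it is an assumption: the "predetermined nucleotide segments" are taken to be all k-mers over ACGT, the basis vectors are taken from a truncated singular value decomposition (the usual way latent semantic analysis is carried out), distances are Euclidean, and the final classification is a simple distance-threshold grouping. The function names `kmer_counts`, `lsa_distances`, and `group_by_threshold` are hypothetical.

```python
from itertools import product

import numpy as np


def kmer_counts(seq, k=2):
    """Count each length-k segment over ACGT (assumed segment definition)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {km: i for i, km in enumerate(kmers)}
    vec = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # segments with other symbols are skipped
        if j is not None:
            vec[j] += 1
    return vec


def lsa_distances(seqs, k=2, rank=2):
    """Project sequences onto `rank` basis vectors and return pairwise distances.

    Assumes every sequence contains at least one valid k-mer and that
    rank is smaller than the number of sequences.
    """
    # Rows are k-mers, columns are sequences; normalize counts to frequencies.
    X = np.column_stack([kmer_counts(s, k) for s in seqs])
    X = X / X.sum(axis=0, keepdims=True)
    # Truncated SVD: each kept column of U is one combination of k-mers.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    basis = U[:, :rank]
    # Approximate representation: coordinates of each sequence in the basis.
    coords = (basis.T @ X).T
    # Euclidean distance between every pair of approximate representations.
    diff = coords[:, None, :] - coords[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    return basis, coords, D


def group_by_threshold(D, threshold):
    """Merge sequences whose distance is at most `threshold` (assumed rule)."""
    labels = list(range(len(D)))
    for i in range(len(D)):
        for j in range(i + 1, len(D)):
            if D[i, j] <= threshold:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    return labels


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    seqs = ["ACACACACACAC", "ACACACACACAT", "GTGTGTGTGTGT", "GTGTGTGTGTGA"]
    basis, coords, D = lsa_distances(seqs, k=2, rank=2)
    print(np.round(D, 3))
    print(group_by_threshold(D, threshold=0.5))
```

Here four toy sequences (the first number) are represented with two basis vectors (the second, smaller number). The threshold of 0.5 is arbitrary; a real application would more plausibly feed the distance matrix to an established clustering or nearest-neighbor method.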
Publication number: US2014121985(A1)  Publication date: 2014.05.01
Application number: US201313954925  Filing date: 2013.07.30
Applicant: SAYOOD KHALID; WAY SAM; NALBANTOGLU OZKAN UFUK; GARRITY GEORGE  Inventor: SAYOOD KHALID; WAY SAM; NALBANTOGLU OZKAN UFUK; GARRITY GEORGE
Classification: G06F19/22  Main classification: G06F19/22
Agency:  Agent:
Main claim:
Address: