Title of invention |
APPARATUS AND METHOD FOR EXTRACTING SIGNIFICANT TERMS WITHIN A DOCUMENT |
Abstract |
An apparatus and a method are provided for extracting the significant terms of a document by using PCA (Principal Component Analysis) and SVD (Singular Value Decomposition), without relying on external resources such as a large-capacity corpus or a thesaurus. A matrix generator (11) constructs a multidimensional space whose coordinate axes are the terms of the document and generates the sentence-term matrix corresponding to that space. An SVD part (12) removes noise by performing SVD on the generated sentence-term matrix. A PCA part (13) performs PCA on the noise-removed sentence-term matrix, represents each principal component selected as a result by its eigenvector coefficients, and quantifies each principal component by its eigenvalue. A significant term extractor extracts the significant terms from the principal components by considering the cumulative ratio of the eigenvalues corresponding to the principal components and the principal component loading coefficients.
|
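The pipeline summarized in the abstract (sentence-term matrix, SVD noise removal, PCA, term selection) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the raw term-count weighting, the truncation rank `rank`, the cumulative-ratio cutoff `cum_ratio`, and the selection rule based on `loading_fraction` are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def build_sentence_term_matrix(sentences):
    """Rows are sentences, columns are terms (raw counts; the weighting is an assumption)."""
    vocab = sorted({t for s in sentences for t in s})
    index = {t: j for j, t in enumerate(vocab)}
    X = np.zeros((len(sentences), len(vocab)))
    for i, s in enumerate(sentences):
        for t in s:
            X[i, index[t]] += 1.0
    return X, vocab

def svd_denoise(X, rank):
    """Keep only the `rank` largest singular values (truncated SVD reconstruction)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(rank, len(s))
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def extract_significant_terms(sentences, rank=2, cum_ratio=0.8, loading_fraction=0.7):
    X, vocab = build_sentence_term_matrix(sentences)
    Xd = svd_denoise(X, rank)

    # PCA via eigendecomposition of the term covariance matrix.
    cov = np.cov(Xd, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]            # sort components by descending eigenvalue
    eigvals = np.clip(eigvals[order], 0.0, None)  # clip tiny negative round-off values
    eigvecs = eigvecs[:, order]

    total = eigvals.sum()
    if total == 0.0:
        return []

    # Keep the leading components until the cumulative eigenvalue ratio reaches cum_ratio.
    ratios = np.cumsum(eigvals) / total
    n_keep = min(int(np.searchsorted(ratios, cum_ratio)) + 1, len(eigvals))

    # Loadings: eigenvector coefficients scaled by the square root of the eigenvalue.
    loadings = eigvecs[:, :n_keep] * np.sqrt(eigvals[:n_keep])

    # Assumed selection rule: per component, keep terms whose absolute loading
    # is at least loading_fraction of that component's largest absolute loading.
    selected = set()
    for c in range(n_keep):
        col = np.abs(loadings[:, c])
        cutoff = loading_fraction * col.max()
        selected.update(vocab[j] for j in np.flatnonzero(col >= cutoff))
    return sorted(selected)

if __name__ == "__main__":
    sents = [
        ["svd", "matrix", "noise"],
        ["pca", "matrix", "eigenvalue"],
        ["svd", "pca", "matrix"],
        ["term", "document"],
    ]
    print(extract_significant_terms(sents))
```

Each sentence is passed as a pre-tokenized list of terms; tokenization and stop-word handling are left out of the sketch. The eigendecomposition of the covariance matrix is used here only because it exposes the eigenvalues and eigenvector coefficients that the abstract mentions directly.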
Application publication number |
KR20070006367(A) |
Application publication date |
2007.01.11 |
Application number |
KR20050061647 |
Filing date |
2005.07.08 |
Applicant |
UNIVERSITY OF ULSAN FOUNDATION FOR INDUSTRY COOPERATION |
Inventors |
LEE, CHANG BEOM;CHOI, HO SEOP;OCK, CHEOL YOUNG;PARK, HYUK RO |
Classification number |
G06F17/30 |
Main classification number |
G06F17/30 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|