Title of invention |
METHOD OF CLUSTERING LITERATURE IN MULTIPLE LANGUAGES |
Abstract |
The present invention relates to the technical field of information retrieval, and more particularly to a clustering method for multilingual documents, comprising the steps of: step 1: establishing a similar-words bank comprising multilingual words; step 2: extracting eight eigenvalues; step 3: calculating a similarity of any two documents i and j; step 4: selecting accumulation points from a set of the documents to establish a cluster; step 5: adding the residual documents in the set that were not selected to the cluster; and step 6: disposing the cluster in a circular ring structure. The method does not limit the languages of the documents: accumulation points are selected according to similarity judgments to establish clusters, and the multilingual documents are classified into those clusters. The method is suitable for clustering multilingual documents. |
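The steps named in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not list the eight eigenvalues, the similarity formula, or the rule for selecting accumulation points, so the sketch substitutes shared-concept counts, cosine similarity, and a greedy threshold rule. The names `SIMILAR_WORDS`, `features`, `similarity`, `cluster`, `ring_neighbours`, and the `threshold` parameter are all hypothetical.

```python
import math
from collections import Counter

# Step 1 (assumed form): a tiny similar-words bank mapping words from
# several languages to one shared concept label.
SIMILAR_WORDS = {
    "cloud": "cloud", "nuage": "cloud", "云": "cloud",
    "computing": "compute", "calcul": "compute", "计算": "compute",
    "cat": "cat", "chat": "cat", "猫": "cat",
    "food": "food", "nourriture": "food", "食物": "food",
}

def features(tokens):
    """Step 2 stand-in: count shared concepts. The abstract's eight
    eigenvalues are not specified, so this is an assumption."""
    return Counter(SIMILAR_WORDS[t] for t in tokens if t in SIMILAR_WORDS)

def similarity(a, b):
    """Step 3 stand-in: cosine similarity of two concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def cluster(docs, threshold=0.5):
    """Steps 4 and 5 under an assumed greedy rule."""
    feats = [features(d) for d in docs]
    # Step 4: a document becomes an accumulation point when it is
    # dissimilar to every accumulation point chosen so far.
    points = []
    for i, f in enumerate(feats):
        if all(similarity(f, feats[p]) < threshold for p in points):
            points.append(i)
    clusters = {p: [p] for p in points}
    # Step 5: each residual document joins its most similar point.
    for i, f in enumerate(feats):
        if i not in clusters:
            best = max(points, key=lambda p: similarity(f, feats[p]))
            clusters[best].append(i)
    return clusters

def ring_neighbours(members, doc):
    """Step 6 stand-in: treat a cluster's member list as a closed ring
    and return the documents before and after `doc`."""
    pos = members.index(doc)
    return members[pos - 1], members[(pos + 1) % len(members)]

# Pre-tokenized English, French, and Chinese toy documents.
DOCS = [
    ["cloud", "computing"],
    ["cat", "food"],
    ["nuage", "calcul"],
    ["猫", "食物"],
    ["云", "计算"],
]

if __name__ == "__main__":
    result = cluster(DOCS)
    print(result)
    print(ring_neighbours(result[0], 0))
```

Because every word is mapped to a language-independent concept before comparison, the French and Chinese documents fall into the same clusters as their English counterparts, which is the property the abstract claims for the method.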
Publication number |
EP2876561(A1) |
Publication date |
2015.05.27 |
Application number |
EP20130886161 |
Filing date |
2013.09.16 |
Applicant |
GUANGDONG ELECTRONICS INDUSTRY INSTITUTE LTD. |
Inventors |
YUAN, ZIMU;PENG, PENG;JI, TONGKAI;YUE, QIANG |
Classification |
G06F17/30 |
Main classification |
G06F17/30 |
Patent agency |
|
Patent agent |
|
Principal claim |
|
Address |
|