Title of invention METHOD OF CLUSTERING LITERATURE IN MULTIPLE LANGUAGES
Abstract The present invention relates to the technical field of information retrieval, and more particularly to a clustering method for multilingual documents, comprising the steps of: step 1: establishing a similar-word bank comprising multilingual words; step 2: extracting eight eigenvalues; step 3: calculating the similarity of any two documents i and j; step 4: selecting accumulation points from the set of documents to establish a cluster; step 5: adding the residual documents in the set that were not selected to the cluster; and step 6: arranging the cluster in a circular ring structure. The method does not limit the languages of the documents: accumulation points are selected according to similarity judgments to establish clusters, and the multilingual documents are classified into those clusters. The method is suitable for clustering multilingual documents.
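Steps 3 to 5 of the abstract can be illustrated with a minimal sketch. This record does not specify the eight eigenvalues, the similarity formula, or the rule for choosing accumulation points, so everything below is an assumption made for illustration: documents are represented by hypothetical numeric feature vectors, similarity is taken to be cosine similarity, a document becomes an accumulation point when it falls below a hypothetical `threshold` against every existing point, and each residual document joins its most similar point. It is not the patented procedure.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(features, threshold=0.8):
    """Group documents around accumulation points.

    features:  dict mapping a document id to its feature vector
               (hypothetical stand-in for the eight extracted values).
    threshold: hypothetical cutoff; a document becomes a new accumulation
               point only if it is less similar than this to every
               existing point.
    Returns a dict mapping each accumulation point to its member ids.
    """
    # Step 4 (assumed rule): greedily pick mutually dissimilar documents.
    points = []
    for doc_id, vec in features.items():
        if all(cosine(vec, features[p]) < threshold for p in points):
            points.append(doc_id)

    clusters = {p: [p] for p in points}

    # Step 5 (assumed rule): attach each residual document to its nearest point.
    for doc_id, vec in features.items():
        if doc_id in clusters:
            continue
        nearest = max(points, key=lambda p: cosine(vec, features[p]))
        clusters[nearest].append(doc_id)
    return clusters


if __name__ == "__main__":
    # Toy two-dimensional vectors; the ids and values are invented.
    docs = {
        "en1": [1.0, 0.0],
        "zh1": [0.9, 0.1],
        "de1": [0.0, 1.0],
        "fr1": [0.1, 0.9],
    }
    print(cluster(docs))
```

The greedy selection depends on document order; a different ordering can yield different accumulation points, which is one reason this should be read only as an illustration of the general idea.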
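Application publication number EP2876561(A1) Application publication date 2015.05.27
Application number EP20130886161 Filing date 2013.09.16
Applicant GUANGDONG ELECTRONICS INDUSTRY INSTITUTE LTD. Inventors YUAN, ZIMU;PENG, PENG;JI, TONGKAI;YUE, QIANG
Classification number G06F17/30 Main classification number G06F17/30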
Agency Agent
Principal claim
Address