Title of invention: FUSION OF CLUSTER LABELING ALGORITHMS BY ANALYZING SUB-CLUSTERS
Abstract: According to some embodiments of the present invention, there is provided a computerized method for labeling a cluster of text documents. The method comprises receiving a document cluster and automatically producing multiple document sub-clusters, each determined by randomly changing some documents of the cluster. The method applies multiple cluster labeling algorithms to the cluster and to each sub-cluster to generate ordered lists of cluster labels. For each cluster labeling algorithm, the method generates a ranked label list by automatically computing label values, one for each cluster label in the lists of that algorithm, and re-ranking the ordered list accordingly. The method combines the re-ranked label lists using a label fusing algorithm to produce a fused label list.
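The sub-cluster step in the abstract can be illustrated with a minimal sketch. The record does not say how documents are "randomly changed", so this example assumes one simple reading: each sub-cluster is the original cluster with a random fraction of its documents removed. The function name `make_subclusters` and the parameters `n_subclusters`, `drop_fraction`, and `seed` are hypothetical and do not come from the patent.

```python
import random


def make_subclusters(cluster, n_subclusters=5, drop_fraction=0.25, seed=0):
    """Return perturbed copies of `cluster`, each missing a few random documents.

    Assumption: "randomly changing" a document is modeled here as removing it.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n_drop = max(1, int(len(cluster) * drop_fraction))  # change at least one document
    if n_drop >= len(cluster):
        raise ValueError("cluster is too small to perturb")
    subclusters = []
    for _ in range(n_subclusters):
        dropped = set(rng.sample(range(len(cluster)), n_drop))
        subclusters.append([doc for i, doc in enumerate(cluster) if i not in dropped])
    return subclusters


# Toy cluster of 20 placeholder documents.
cluster = [f"doc{i}" for i in range(20)]
subclusters = make_subclusters(cluster, n_subclusters=4, drop_fraction=0.25)
```

Each sub-cluster would then be passed to every cluster labeling algorithm alongside the full cluster, which yields one ordered label list per (algorithm, cluster variant) pair.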
Publication No.: US2016217201(A1) Publication date: 2016.07.28
Application No.: US201514607092 Filing date: 2015.01.28
Applicant: International Business Machines Corporation Inventors: Hummel Shay; Roitman Haggai; Shmueli-Scheuer Michal
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim: 1. A computerized method for labeling a cluster of text documents, comprising: receiving a document cluster comprising a plurality of text documents; producing automatically a plurality of document sub-clusters, wherein each of said plurality of document sub-clusters is determined by randomly changing at least one document of said plurality of text documents from said document cluster; applying automatically a plurality of cluster labeling algorithms on said document cluster and each of said plurality of document sub-clusters, to generate a plurality of ordered lists; generating a plurality of ranked label lists one for each of said plurality of cluster labeling algorithms, by performing the actions of: selecting automatically a selected algorithm from one of said plurality of cluster labeling algorithms; computing automatically a plurality of label values, one for each of said plurality of cluster labels in said plurality of ordered lists for said selected algorithm; generating one of said plurality of ranked label lists corresponding to said selected algorithm, wherein said one ranked label list is computed from respective said plurality of ordered lists and respective said label value for each of respective said plurality of cluster labels; and combining said plurality of ranked label lists using a label fusing algorithm to produce a fused label list.
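The re-ranking and fusion steps of the claim can be sketched as follows. The claim does not define the label value or name a label fusing algorithm, so both choices here are assumptions made for illustration: the label value is the fraction of sub-cluster lists whose top `top_k` entries contain the label (a simple stability score), and the fusion step uses reciprocal rank fusion. The names `label_values`, `rerank`, `fuse`, `top_k`, and `k` are hypothetical, and the label lists are invented toy data.

```python
from collections import defaultdict


def label_values(full_list, sub_lists, top_k=2):
    """Assumed label value: share of sub-cluster lists with the label in their top_k."""
    labels = set(full_list).union(*sub_lists)
    return {
        label: sum(label in sub[:top_k] for sub in sub_lists) / len(sub_lists)
        for label in labels
    }


def rerank(full_list, sub_lists, top_k=2):
    """Re-rank one algorithm's labels by value, then by original position."""
    values = label_values(full_list, sub_lists, top_k)
    position = {label: i for i, label in enumerate(full_list)}
    return sorted(
        values,
        key=lambda l: (-values[l], position.get(l, len(full_list)), l),
    )


def fuse(ranked_lists, k=60):
    """Reciprocal rank fusion, used here as a stand-in label fusing algorithm."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, label in enumerate(ranked, start=1):
            scores[label] += 1.0 / (k + rank)
    return sorted(scores, key=lambda l: (-scores[l], l))


# Toy ordered lists for two hypothetical labeling algorithms, A and B:
# one list for the full cluster and one per sub-cluster.
full_a = ["sports", "football", "league"]
subs_a = [
    ["football", "sports", "league"],
    ["football", "league", "sports"],
    ["football", "sports", "match"],
]
full_b = ["soccer", "football", "sports"]
subs_b = [
    ["football", "soccer", "sports"],
    ["soccer", "football", "goal"],
    ["football", "sports", "soccer"],
]

ranked_a = rerank(full_a, subs_a)
ranked_b = rerank(full_b, subs_b)
fused = fuse([ranked_a, ranked_b])
```

In this toy data, "football" is not the top label of either algorithm on the full cluster, but it stays in the top two across every sub-cluster, so the stability-based value moves it to the head of both ranked lists and of the fused list.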
Address: Armonk, NY, US