Title of invention: METHODS AND COMPUTER-PROGRAM PRODUCTS FOR ORGANIZING ELECTRONIC DOCUMENTS
Abstract: Methods of organizing documents by reclassification and clustering are disclosed. In one embodiment, a method of clustering electronic documents of a document corpus includes comparing, by a computer, each individual electronic document in the document corpus with each other electronic document in the document corpus, thereby forming document pairs. The electronic documents of the document pairs are compared by calculating a similarity value with respect to the electronic documents of a document pair, associating the similarity value with both electronic documents of the document pair, and applying a clustering algorithm to the document corpus using the similarity values to create a plurality of hierarchical clusters. The similarity value is based on a plurality of attributes of the electronic documents in the document corpus. The plurality of attributes includes a citation attribute, a text-based attribute and one or more of an author-attribute, a publication-attribute, an institution-attribute, a downloads-attribute, and a clustering-results-attribute.
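The pipeline summarized in the abstract (pairwise comparison, a multi-attribute similarity value per document pair, then hierarchical clustering) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the Jaccard overlap used for each attribute, the weights in `WEIGHTS`, the choice of average-linkage agglomerative clustering, and the toy corpus with its field names `citations`, `terms`, and `authors`.

```python
from itertools import combinations

# Hypothetical attribute weights (not specified in the abstract).
WEIGHTS = {"citations": 0.5, "terms": 0.3, "authors": 0.2}


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets are treated as dissimilar."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(doc_a, doc_b, weights=WEIGHTS):
    """Weighted sum of per-attribute Jaccard scores for one document pair."""
    return sum(w * jaccard(doc_a[k], doc_b[k]) for k, w in weights.items())


def pairwise_similarities(corpus):
    """Compare every document with every other document.

    The value is keyed by the unordered pair, so it is associated
    with both documents of the pair.
    """
    return {
        frozenset((i, j)): similarity(corpus[i], corpus[j])
        for i, j in combinations(sorted(corpus), 2)
    }


def hierarchical_clusters(corpus, sims):
    """Average-linkage agglomerative clustering (an assumed algorithm).

    Returns the merge history as (cluster_a, cluster_b, linkage) tuples,
    which describes the cluster hierarchy from the bottom up.
    """
    def linkage(c1, c2):
        total = sum(sims[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [frozenset([d]) for d in sorted(corpus)]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters whose members are most similar on average.
        c1, c2 = max(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


# Toy corpus with made-up identifiers, for illustration only.
corpus = {
    "d1": {"citations": {"r1", "r2"}, "terms": {"cluster", "document"}, "authors": {"a1"}},
    "d2": {"citations": {"r1", "r2"}, "terms": {"cluster", "corpus"}, "authors": {"a1"}},
    "d3": {"citations": {"r9"}, "terms": {"protein"}, "authors": {"a7"}},
}

sims = pairwise_similarities(corpus)
merges = hierarchical_clusters(corpus, sims)
for left, right, score in merges:
    print(sorted(left), "+", sorted(right), "at linkage", round(score, 3))
```

In this toy corpus, `d1` and `d2` share their citations and author, so they merge first; `d3` joins only at the top of the hierarchy. Evaluating every cluster pair at each step is simple to follow but scales poorly, so a real corpus would likely need a more efficient clustering method.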
Publication number: EP3134831(A2) Publication date: 2017.03.01
Application number: EP20150787281 Filing date: 2015.04.22
Applicant: Elsevier B.V. Inventors: SCHIJVENAARS, Bob; DOORNENBAL, Marius
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address: