Abstract |
Methods of organizing documents by reclassification and clustering are disclosed. In one embodiment, a method of clustering electronic documents of a document corpus includes comparing, by a computer, each individual electronic document in the document corpus with each other electronic document in the document corpus, thereby forming document pairs. The electronic documents of each document pair are compared by calculating a similarity value for the pair and associating the similarity value with both electronic documents of the pair. A clustering algorithm is then applied to the document corpus using the similarity values to create a plurality of hierarchical clusters. The similarity value is based on a plurality of attributes of the electronic documents in the document corpus. The plurality of attributes includes a citation attribute, a text-based attribute, and one or more of an author attribute, a publication attribute, an institution attribute, a downloads attribute, and a clustering-results attribute. |
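
The pairwise comparison and hierarchical clustering described in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation. It assumes each attribute is represented as a set, that per-attribute similarity is Jaccard overlap, that attributes are combined by a weighted sum, and that clusters are merged by average linkage. The names `WEIGHTS`, `similarity`, `pairwise_similarities`, and `hierarchical_clusters`, the weight values, and the sample corpus are all hypothetical.

```python
from itertools import combinations

# Hypothetical attribute weights; the abstract does not specify any weighting.
WEIGHTS = {"citations": 0.5, "terms": 0.3, "authors": 0.2}


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(doc_a, doc_b, weights=WEIGHTS):
    """Weighted sum of per-attribute overlaps (an assumed combination rule)."""
    return sum(w * jaccard(doc_a[k], doc_b[k]) for k, w in weights.items())


def pairwise_similarities(corpus):
    """Compare every document with every other document.

    Each value is keyed by the unordered pair of document ids, so it is
    associated with both documents of the pair.
    """
    return {
        frozenset((i, j)): similarity(corpus[i], corpus[j])
        for i, j in combinations(sorted(corpus), 2)
    }


def hierarchical_clusters(corpus, sims):
    """Average-linkage agglomerative clustering (assumed algorithm).

    Returns the merge history as (cluster_a, cluster_b, linkage) tuples,
    which encodes the cluster hierarchy from the bottom up.
    """
    def linkage(c1, c2):
        values = [sims[frozenset((a, b))] for a in c1 for b in c2]
        return sum(values) / len(values)

    clusters = [frozenset([doc_id]) for doc_id in sorted(corpus)]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the highest average similarity.
        c1, c2 = max(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


# Made-up three-document corpus, for illustration only.
corpus = {
    "A": {"citations": {"r1", "r2", "r3"}, "terms": {"cluster", "document"},
          "authors": {"lee"}},
    "B": {"citations": {"r1", "r2"}, "terms": {"cluster", "document", "corpus"},
          "authors": {"lee"}},
    "C": {"citations": {"r9"}, "terms": {"protein", "folding"},
          "authors": {"kim"}},
}

sims = pairwise_similarities(corpus)
merges = hierarchical_clusters(corpus, sims)
for left, right, score in merges:
    print(sorted(left), sorted(right), round(score, 3))
```

In this sketch documents A and B share citations, terms, and an author, so they merge first. Document C shares nothing with either and joins only at the top of the hierarchy. A production system would more likely use a library routine for the clustering step and a richer text-based measure than term-set overlap.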