Abstract |
<p>In electronic document processing, a processor defines clusters containing documents from a set of electronic documents. The processor selects an update subset of the electronic documents, limiting the update subset to at most a first predetermined number of electronic documents. The processor also selects a reference subset of the electronic documents, limiting the reference subset to at most a second predetermined number of electronic documents, the reference subset properly containing the update subset. Only the electronic documents in the update subset are assigned or re-assigned to the clusters. The assignment is based on the similarity of each such document to each cluster, computed from a measure of similarity between the electronic documents in the update subset and individual electronic documents of that cluster, limited to individual documents that belong to the reference subset.</p> |
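
The update step described in the abstract can be illustrated with a minimal Python sketch. Everything below is hypothetical and not taken from the source: the names `update_clusters`, `jaccard`, `max_update`, and `max_reference`, the choice of the newest documents as the two subsets, the use of mean pairwise similarity as the cluster score, and the Jaccard word-overlap measure. The abstract itself does not specify how the subsets are selected or which similarity measure is used.

```python
from collections import defaultdict
from typing import Callable, Dict, Optional, Sequence


def jaccard(a: str, b: str) -> float:
    """Hypothetical similarity measure: word-set overlap of two documents."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def update_clusters(
    docs: Sequence[str],
    assignment: Dict[str, Optional[int]],
    similarity: Callable[[str, str], float],
    max_update: int,
    max_reference: int,
) -> Dict[str, Optional[int]]:
    """Re-assign only the update subset, comparing against reference-subset members."""
    if max_reference <= max_update:
        raise ValueError("reference subset must be larger than the update subset")

    # Assumption: docs is ordered newest first, so both subsets are prefixes
    # and the reference subset contains the update subset.
    update = list(docs[:max_update])
    reference = list(docs[:max_reference])

    new_assignment = dict(assignment)
    for doc in update:
        # Collect the current members of each cluster, restricted to the
        # reference subset and excluding the document being assigned.
        members = defaultdict(list)
        for other in reference:
            cluster = new_assignment.get(other)
            if other != doc and cluster is not None:
                members[cluster].append(other)
        if not members:
            continue  # nothing to compare against; leave the document as is

        # Assumption: a cluster's score is the mean pairwise similarity.
        scores = {
            cluster: sum(similarity(doc, o) for o in others) / len(others)
            for cluster, others in members.items()
        }
        new_assignment[doc] = max(scores, key=scores.get)
    return new_assignment


docs = ["new apple pie", "apple tart", "car engine", "old apple cake"]
current = {"new apple pie": None, "apple tart": 0, "car engine": 1, "old apple cake": 1}
result = update_clusters(docs, current, jaccard, max_update=1, max_reference=3)
```

In this sketch the cost of one update is bounded by `max_update * max_reference` similarity computations, regardless of how many documents the full set contains. Documents outside the reference subset, such as `"old apple cake"` here, keep their cluster but are ignored when scoring.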