Abstract |
For each document in a document set, entities are identified, and a set of association rules is derived based on the appearance of those entities in the paragraphs of the documents in the set. The documents are then clustered based on the association rules. As documents are added to the clusters, additional association rules specific to each cluster can optionally be derived as well. |
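
The flow described above can be illustrated with a minimal sketch. It is not the claimed method: it assumes entity identification has already been done upstream (each paragraph is given as a set of entity strings), treats a rule as a directed pair "A implies B" kept when its paragraph-level support and confidence pass thresholds, and groups documents greedily by Jaccard overlap of the rules they satisfy. The names `derive_rules`, `rules_in_doc`, `cluster`, and the parameters `min_support`, `min_confidence`, and `min_overlap` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def derive_rules(docs, min_support=2, min_confidence=0.6):
    """Derive pairwise rules (a, b), read as 'a implies b', from paragraphs.

    docs maps a document id to a list of paragraphs; each paragraph is a
    set of entity strings (assumed to come from an upstream entity tagger).
    """
    item_count = Counter()
    pair_count = Counter()
    for paragraphs in docs.values():
        for entities in paragraphs:
            item_count.update(entities)
            pair_count.update(combinations(sorted(entities), 2))

    rules = set()
    for (a, b), n in pair_count.items():
        if n < min_support:
            continue  # pair appears together in too few paragraphs
        if n / item_count[a] >= min_confidence:
            rules.add((a, b))
        if n / item_count[b] >= min_confidence:
            rules.add((b, a))
    return rules


def rules_in_doc(paragraphs, rules):
    """Return the rules whose two entities share at least one paragraph."""
    return {(a, b) for (a, b) in rules
            if any(a in p and b in p for p in paragraphs)}


def cluster(docs, rules, min_overlap=0.5):
    """Greedily assign each document to the cluster with the most similar rule set."""
    clusters = []  # each cluster: {"rules": set of rules, "docs": list of ids}
    for doc_id, paragraphs in docs.items():
        matched = rules_in_doc(paragraphs, rules)
        best, best_score = None, 0.0
        for c in clusters:
            union = matched | c["rules"]
            score = len(matched & c["rules"]) / len(union) if union else 0.0
            if score > best_score:
                best, best_score = c, score
        if best is not None and best_score >= min_overlap:
            best["docs"].append(doc_id)
            best["rules"] |= matched
        else:
            clusters.append({"rules": set(matched), "docs": [doc_id]})
    return clusters


# Toy input: entity sets per paragraph, invented purely for illustration.
docs = {
    "d1": [{"Acme", "Widget"}, {"Acme", "Widget", "Berlin"}],
    "d2": [{"Acme", "Widget"}],
    "d3": [{"Nile", "Cairo"}, {"Nile", "Cairo"}],
    "d4": [{"Cairo", "Nile"}],
}
rules = derive_rules(docs)
clusters = cluster(docs, rules)
print([c["docs"] for c in clusters])
```

Under these assumptions, the optional cluster-specific step could be approximated by calling `derive_rules` again on only the documents in one cluster, typically with lower thresholds, once that cluster has grown.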