Abstract |
One embodiment of the present invention provides a system for estimating document similarity. During operation, the system selects a collection of documents which includes a first set of passages, constructs a passage-sequence model based on the first set of passages, receives a new document which includes a second set of passages, and determines a sequence of operations associated with the new document in relation to the collection of documents based on the constructed passage-sequence model. |
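The abstract does not specify the form of the passage-sequence model or the set of operations, so the following is only a minimal sketch under stated assumptions: a passage is taken to be a paragraph, the "model" is simply each document's ordered list of passages, and the "sequence of operations" is approximated by the edit opcodes (`equal`, `insert`, `delete`, `replace`) that Python's standard `difflib.SequenceMatcher` produces. The function names `split_passages`, `build_model`, and `operations_for` are hypothetical and do not come from the source.

```python
from difflib import SequenceMatcher


def split_passages(text):
    """Split a document into passages (assumption: one passage per paragraph)."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def build_model(collection):
    """Stand-in passage-sequence model: document id -> ordered list of passages."""
    return {doc_id: split_passages(text) for doc_id, text in collection.items()}


def operations_for(new_text, model):
    """Return (doc_id, opcodes) for the collection document most similar to new_text.

    Each opcode is a tuple (tag, i1, i2, j1, j2) describing how passages
    i1..i2 of the stored document relate to passages j1..j2 of the new one.
    """
    new_passages = split_passages(new_text)
    best_id, best_ratio, best_ops = None, -1.0, []
    for doc_id, passages in model.items():
        # Compare whole passages as sequence elements, not characters.
        matcher = SequenceMatcher(None, passages, new_passages, autojunk=False)
        ratio = matcher.ratio()
        if ratio > best_ratio:
            best_id, best_ratio, best_ops = doc_id, ratio, matcher.get_opcodes()
    return best_id, best_ops
```

Under these assumptions, a new document that reuses a stored document's passages and adds one paragraph in the middle would be reported as an `equal` run, an `insert`, and another `equal` run relative to that stored document. A fuller system would presumably match passages approximately rather than by exact string equality.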