Abstract |
<p>A method and document separation system for separating a set of related documents are described. In one aspect, the method comprises: determining, on a document selection system, quality scores for a plurality of the documents in the set of related documents; obtaining a similarity score for each of a plurality of pairs of documents in the set of related documents; and obtaining, on the document selection system, a first subset of related documents that solves an optimization problem, the first subset of related documents including a portion of the documents in the set of related documents, the optimization problem being a function of one or more quality scores of the documents assigned to the first subset of related documents and one or more similarity scores of pairs of documents assigned to the first subset of related documents.</p> |
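
By way of illustration only, the selection step might be sketched as follows. The abstract does not specify the form of the optimization problem or how it is solved, so everything here is an assumption: the objective (sum of quality scores minus a weighted sum of pairwise similarity scores), the greedy heuristic, and the names `objective`, `select_subset`, `penalty`, and `size` are all hypothetical. Pairs with no similarity score are assumed to contribute zero.

```python
from itertools import combinations


def objective(subset, quality, similarity, penalty=1.0):
    """Hypothetical objective: total quality minus penalized pairwise similarity."""
    total_quality = sum(quality[doc] for doc in subset)
    # Pairs with no recorded similarity score are assumed to contribute 0.
    total_similarity = sum(
        similarity.get(frozenset(pair), 0.0) for pair in combinations(subset, 2)
    )
    return total_quality - penalty * total_similarity


def select_subset(quality, similarity, size, penalty=1.0):
    """Greedy heuristic (an assumption, not the claimed method): repeatedly
    add the document that yields the highest objective value."""
    chosen = []
    remaining = set(quality)
    while remaining and len(chosen) < size:
        best = max(
            sorted(remaining),  # sorted for deterministic tie-breaking
            key=lambda doc: objective(chosen + [doc], quality, similarity, penalty),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Illustrative scores only; "a" and "b" are near-duplicates.
quality = {"a": 0.9, "b": 0.8, "c": 0.5}
similarity = {
    frozenset({"a", "b"}): 0.95,
    frozenset({"a", "c"}): 0.1,
    frozenset({"b", "c"}): 0.2,
}
print(select_subset(quality, similarity, size=2))  # ['a', 'c']
```

In this sketch the higher-quality document "b" is passed over in favor of "c" because "b" is nearly identical to the already selected "a". A greedy pass does not guarantee an optimal subset; an exact solver could be substituted without changing the objective.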