Abstract |
Disclosed herein is a method for comparing a plurality of documents. The method includes the steps of: determining a plurality of similarity measures for the plurality of documents; and determining an overall similarity measure for the plurality of documents based on the plurality of similarity measures. In one embodiment, the similarity measures are selected from the group consisting of semantic similarity measures and reference similarity measures. When the documents being compared are from the chemical, biochemical, or pharmaceutical domains, the similarity determination includes determining the structural similarity of the chemical formulas described in the plurality of documents.
|
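The two steps recited in the abstract (computing several similarity measures, then combining them into an overall measure) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the disclosure: cosine similarity over term counts stands in for the semantic measure, Jaccard overlap of cited references stands in for the reference measure, and a weighted mean with hypothetical weights stands in for the combination step. A structural measure for chemical formulas could be passed to `overall_similarity` in the same way as a further named entry.

```python
from collections import Counter
from math import sqrt


def semantic_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of term-count vectors (assumed stand-in for a semantic measure)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def reference_similarity(refs_a: set[str], refs_b: set[str]) -> float:
    """Jaccard overlap of cited references (assumed stand-in for a reference measure)."""
    union = refs_a | refs_b
    return len(refs_a & refs_b) / len(union) if union else 0.0


def overall_similarity(measures: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of the individual measures; the weighting scheme is hypothetical."""
    total = sum(weights[name] for name in measures)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(measures[name] * weights[name] for name in measures) / total
```

A caller would compute each measure for a pair of documents, collect the results in a dictionary keyed by measure name, and pass that dictionary with a matching dictionary of weights to `overall_similarity`. The weighted mean is only one possible combination rule; the abstract does not specify how the overall measure is derived from the individual ones.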