Abstract |
Our method assigns importance ranks to documents within repositories or databases. It uses a corpus of indexed documents annotated with terms from one or more ontologies to assign a semantic similarity score to queries composed of terms taken from those ontologies. A statistical model tests the significance of matches between query terms and documents or categories. The method requires a means of determining the distribution of P-values for similarity scores. One such procedure, developed by us, yields an acceleration of over 10,000-fold for realistic queries and ontologies, making it practicable to calculate P-values dynamically or to keep database annotations and the related P-value distributions up to date by frequent recalculation. The method is particularly useful for improving search engine results for databases annotated with one or more ontologies, and can be used to index and search collections of books, web pages, company documents, and similar databases. |
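The scoring and significance steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy ontology `PARENTS`, the annotations in `ANNOTATIONS`, the use of a Resnik-style information-content similarity, and the estimation of the P-value by brute-force random sampling. The abstract does not name the similarity measure, and this sketch does not reproduce the accelerated computation it mentions.

```python
import math
import random

# Hypothetical toy ontology: term -> set of parent terms (a DAG rooted at "root").
PARENTS = {
    "root": set(),
    "a": {"root"},
    "b": {"root"},
    "a1": {"a"},
    "a2": {"a"},
    "b1": {"b"},
}

# Hypothetical annotations: document -> ontology terms it is annotated with.
ANNOTATIONS = {
    "doc1": {"a1"},
    "doc2": {"a2", "b1"},
    "doc3": {"b1"},
    "doc4": {"a1", "a2"},
}


def ancestors(term):
    """Return the term itself plus all of its ancestors in the ontology."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log(fraction of documents annotated with t or a descendant of t)."""
    counts = {t: 0 for t in PARENTS}
    for terms in ANNOTATIONS.values():
        covered = set().union(*(ancestors(t) for t in terms))
        for t in covered:
            counts[t] += 1
    n = len(ANNOTATIONS)
    # Assumption of this sketch: terms with no annotations get IC 0.
    return {t: -math.log(c / n) if c else 0.0 for t, c in counts.items()}


IC = information_content()


def term_similarity(t1, t2):
    """Resnik-style similarity: IC of the most informative common ancestor."""
    return max(IC[t] for t in ancestors(t1) & ancestors(t2))


def score(query, doc_terms):
    """Average, over query terms, of the best match among the document's terms."""
    return sum(max(term_similarity(q, d) for d in doc_terms) for q in query) / len(query)


def empirical_p_value(query, doc_terms, samples=2000, seed=0):
    """Estimate P(score of a random query of the same size >= observed score)."""
    rng = random.Random(seed)
    observed = score(query, doc_terms)
    terms = sorted(PARENTS)
    hits = sum(
        score(rng.sample(terms, len(query)), doc_terms) >= observed
        for _ in range(samples)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (samples + 1)
```

Documents could then be ranked for a query by ascending `empirical_p_value(query, ANNOTATIONS[doc])`. Sampling of this kind may become expensive for large ontologies and many documents, which is the cost a faster way of obtaining the score distribution would address.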
Applicants |
CHARITE-UNIVERSITAETSMEDIZIN BERLIN;MAX-PLANCK-GESELLSCHAFT ZUR FOERDERUNG DER WISSENSCHAFTEN E. V.;ROBINSON, PETER, N.;SCHULZ, MARCEL, H.;BAUER, SEBASTIAN;KOEHLER, SEBASTIAN |
Inventors |
ROBINSON, PETER, N.;SCHULZ, MARCEL, H.;BAUER, SEBASTIAN;KOEHLER, SEBASTIAN |