Independent claim |
1. A method for estimating the effectiveness of information retrieval for electronic discovery comprising:
calculating, by at least one computer processor configured to operate in an information retrieval system, a plurality of statistics for a plurality of test documents, wherein the plurality of statistics for the plurality of test documents comprises a number of documents that are false negatives in the plurality of test documents; calculating, by the at least one computer processor, the number of false negatives for a corpus of documents based on one or more of a number of test documents in the plurality of test documents, a size of the corpus of documents, a predetermined confidence level, and the number of false negatives in the plurality of test documents, wherein classification of a document of the corpus of documents is a false negative if classification of the document by a classification model is negative and classification of the document by a user is positive; and calculating, by the at least one computer processor, an effectiveness of the information retrieval system on the corpus of documents based on the number of false negatives for the corpus of documents. |
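The three calculating steps of the claim can be illustrated with a minimal sketch. The claim does not specify an estimator or an effectiveness measure, so everything below is an assumption made for illustration: the test documents are treated as a simple random sample of the corpus, the corpus-level number of false negatives is bounded with a one-sided normal-approximation interval with a finite population correction, and effectiveness is taken to be recall. The function names and the `true_positives` input are hypothetical and do not come from the claim.

```python
import math
from statistics import NormalDist


def estimate_corpus_false_negatives(sample_size, sample_false_negatives,
                                    corpus_size, confidence=0.95):
    """Return (point_estimate, upper_bound) for false negatives in the corpus.

    Assumption: the test documents are a simple random sample drawn without
    replacement from the corpus, and a normal approximation is acceptable.
    """
    if not 0 < sample_size <= corpus_size:
        raise ValueError("sample_size must be in 1..corpus_size")
    if not 0 <= sample_false_negatives <= sample_size:
        raise ValueError("sample_false_negatives must be in 0..sample_size")

    # Proportion of false negatives observed among the test documents.
    p = sample_false_negatives / sample_size

    # One-sided critical value for the predetermined confidence level.
    z = NormalDist().inv_cdf(confidence)

    # Finite population correction: shrinks the margin as the sample
    # approaches the size of the corpus.
    if corpus_size > 1:
        fpc = math.sqrt((corpus_size - sample_size) / (corpus_size - 1))
    else:
        fpc = 0.0

    margin = z * math.sqrt(p * (1 - p) / sample_size) * fpc
    upper = min(1.0, p + margin)
    return p * corpus_size, upper * corpus_size


def recall(true_positives, false_negatives):
    """Effectiveness measured as recall (an assumed choice of measure)."""
    denominator = true_positives + false_negatives
    return true_positives / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical figures: 8 false negatives found among 400 test documents
    # sampled from a corpus of 100,000 documents.
    point, upper = estimate_corpus_false_negatives(400, 8, 100_000, 0.95)
    # Hypothetical count of documents correctly classified as positive.
    print(round(point), round(upper), recall(8_000, upper))
```

Feeding the upper bound rather than the point estimate into `recall` gives a conservative effectiveness figure at the chosen confidence level. For very small false-negative counts an exact binomial or hypergeometric bound would be a more defensible choice than the normal approximation used here.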