Title of invention Electronic document classification apparatus
Abstract The apparatus computes classification scores based on parameters that have been determined from documents. Each score is compared with a first and a second threshold. A definite classification is assigned when the score is above the higher threshold or below the lower threshold, and the document is processed accordingly. If the score lies between the thresholds, the document is singled out for further inspection, for example by a human arbitrator, to assign a class. The first and second thresholds are adapted automatically based on a specified minimum accuracy level for the classification and a training set. The apparatus uses this specified accuracy in a search for a combination of threshold values that optimizes classifier yield, that is, maximizes the fraction of patterns in the training set that receive a definite classification and need not be turned over for further inspection. The search is subject to the condition that the combination of thresholds results in at least the specified accuracy over the training set.
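The threshold search described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes binary labels (0 or 1), an exhaustive search over candidate cut points placed between the observed scores, and accuracy measured only over the documents that receive a definite classification. The function names `evaluate` and `search_thresholds` are hypothetical.

```python
from itertools import combinations_with_replacement


def evaluate(scores, labels, lo, hi):
    """Return (yield, accuracy) for a threshold pair with lo <= hi.

    Assumption: score > hi means class 1, score < lo means class 0,
    anything else is referred for further inspection.
    """
    decided = 0
    correct = 0
    for score, label in zip(scores, labels):
        if score > hi:
            predicted = 1
        elif score < lo:
            predicted = 0
        else:
            continue  # between the thresholds: no definite classification
        decided += 1
        correct += int(predicted == label)
    classifier_yield = decided / len(scores)
    # Assumption: accuracy is vacuously 1.0 when nothing is decided.
    accuracy = correct / decided if decided else 1.0
    return classifier_yield, accuracy


def search_thresholds(scores, labels, min_accuracy):
    """Exhaustively find the (lo, hi) pair with maximum yield whose
    accuracy over the training set is at least min_accuracy."""
    unique = sorted(set(scores))
    # Candidate cuts: midpoints between neighbouring scores, plus one
    # sentinel below and one above all scores.
    cuts = [unique[0] - 1.0]
    cuts += [(a + b) / 2 for a, b in zip(unique, unique[1:])]
    cuts.append(unique[-1] + 1.0)

    best = None  # (yield, lo, hi)
    # cuts is sorted, so every generated pair satisfies lo <= hi.
    for lo, hi in combinations_with_replacement(cuts, 2):
        classifier_yield, accuracy = evaluate(scores, labels, lo, hi)
        if accuracy >= min_accuracy and (best is None or classifier_yield > best[0]):
            best = (classifier_yield, lo, hi)
    return best


if __name__ == "__main__":
    train_scores = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
    train_labels = [0, 0, 1, 0, 1, 1]
    print(search_thresholds(train_scores, train_labels, min_accuracy=1.0))
```

In this toy training set the two middle documents are scored on the wrong side of any single cut, so demanding perfect accuracy forces the thresholds apart and sends those two documents to inspection. Lowering the required accuracy lets the thresholds move together and raises the yield.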
Publication number US8554715(B2) Publication date 2013.10.08
Application number US200913125701 Filing date 2009.10.29
Applicant KRAAIJ WESSEL;RAAIJMAKERS STEPHAN ALEXANDER;NEDERLANDSE ORGANISATIE VOOR TOEGEPAST-NATUURWETENSCHAPPELIJK ONDERZOEK TNO Inventor KRAAIJ WESSEL;RAAIJMAKERS STEPHAN ALEXANDER
Classification G06F5/00;G06N5/00 Main classification G06F5/00
Agency Agent
Main claim
Address