Title of invention SYSTEM AND METHOD FOR AUTOMATICALLY CLASSIFYING DOCUMENTS
Abstract A system and method for automatically classifying documents using an annotated topic tree is provided. A set of topics may be extracted from a document corpus such that each document in the corpus is associated with a topic model. A sample set of documents may be selected from the document corpus during a current sampling round. Human reviewers may annotate the topic models associated with the sample set of documents with coding information. An annotated topic tree may be formed based on the annotated topic models. One or more machine learning algorithms may be used to project the information in the annotated topic tree to the rest of the document corpus. A voting algorithm, which may comprise a plurality of machine learning algorithms, may also be used to project the sampling judgments to the rest of the document corpus.
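The voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which machine learning algorithms take part in the vote or how ties are resolved, so everything below is an assumption made for illustration: the three toy classifiers (`train_nearest_overlap`, `train_label_vocabulary`, `train_majority_label`), the simple majority rule in `vote`, and the `project` helper are hypothetical names, not the method claimed in the application.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

# A reviewer-coded sample document: (text, label). Hypothetical representation.
Labeled = Tuple[str, str]
Classifier = Callable[[str], str]


def _words(text: str) -> set:
    return set(text.lower().split())


def train_nearest_overlap(sample: Sequence[Labeled]) -> Classifier:
    """Toy classifier: copy the label of the sample document sharing the most words."""
    def classify(text: str) -> str:
        words = _words(text)
        best = max(sample, key=lambda item: len(words & _words(item[0])))
        return best[1]
    return classify


def train_label_vocabulary(sample: Sequence[Labeled]) -> Classifier:
    """Toy classifier: pick the label whose pooled vocabulary covers the most words."""
    vocab: Dict[str, set] = {}
    for text, label in sample:
        vocab.setdefault(label, set()).update(_words(text))

    def classify(text: str) -> str:
        words = _words(text)
        return max(vocab, key=lambda label: len(words & vocab[label]))
    return classify


def train_majority_label(sample: Sequence[Labeled]) -> Classifier:
    """Toy baseline: always predict the most frequent label in the sample."""
    majority = Counter(label for _, label in sample).most_common(1)[0][0]
    return lambda text: majority


def vote(classifiers: Sequence[Classifier], text: str) -> str:
    """Assumed rule: plain majority vote; on a tie, the label voted first wins."""
    votes = Counter(clf(text) for clf in classifiers)
    return votes.most_common(1)[0][0]


def project(sample: Sequence[Labeled], unlabeled: List[str]) -> Dict[str, str]:
    """Project the sample's coding judgments onto the remaining documents."""
    classifiers = [
        train_nearest_overlap(sample),
        train_label_vocabulary(sample),
        train_majority_label(sample),
    ]
    return {doc: vote(classifiers, doc) for doc in unlabeled}


if __name__ == "__main__":
    sample = [
        ("merger agreement signed by board", "responsive"),
        ("quarterly merger filing draft", "responsive"),
        ("lunch menu for friday", "non-responsive"),
    ]
    print(project(sample, ["board approves merger", "friday lunch order"]))
```

In a realistic setting the toy classifiers would be replaced by trained models, and the documents would likely be represented by their topic models rather than raw words; the sketch only shows how several independent judgments can be combined into one projected label per document.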
Publication number WO2014120835(A1) Publication date 2014.08.07
Application number WO2014US13683 Filing date 2014.01.29
Applicant ERNST & YOUNG LLP Inventors OEHRLE, RICHARD THOMAS;JOHNSON, ERIC ALLEN;BOTHRA, ARPIT;BRENIER, JASON M.;CUENI, ANNA BARBARA;MORLEY, ERIC ABEL
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim
Address