Title of invention: SYSTEMS AND METHODS FOR TRAINING DOCUMENT ANALYSIS SYSTEM FOR AUTOMATICALLY EXTRACTING DATA FROM DOCUMENTS
Abstract: A method of training a document analysis system to extract data from documents is provided. The method includes: automatically analyzing images and text features extracted from a document to associate the document with a corresponding document category; comparing the extracted text features with a set of text features associated with the corresponding category of the document, in which the set of text features includes a set of characters, words, and phrases; if the extracted text features are found to consist of the characters, words, and phrases belonging to the set of text features associated with the corresponding document category, storing the extracted text features as the data contained in the corresponding document; and, if the extracted text features are found to include at least one text feature that does not belong to the set of text features associated with the corresponding document category, submitting the unrecognized text features to a training phase.
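The store-or-train decision described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `route_text_features`, the dictionary return shape, and the invoice-like vocabulary are hypothetical, and the abstract does not say what happens to the recognized features when some are unrecognized, so this sketch forwards only the unrecognized ones.

```python
def route_text_features(extracted, known_features):
    """Decide whether extracted text features are stored or sent to training.

    extracted:      list of characters, words, or phrases pulled from a document
    known_features: set of text features already associated with the
                    document's category (hypothetical representation)
    """
    # Collect every extracted feature that the category does not yet know.
    unrecognized = [f for f in extracted if f not in known_features]

    if not unrecognized:
        # All features belong to the category's set: store them as document data.
        return {"action": "store", "features": list(extracted)}

    # At least one feature is unknown: submit the unknown ones to training.
    return {"action": "train", "features": unrecognized}


# Hypothetical vocabulary for an invoice-like category (an assumption).
invoice_vocabulary = {"Invoice", "Total", "Due Date"}

print(route_text_features(["Invoice", "Total"], invoice_vocabulary))
print(route_text_features(["Invoice", "Totl"], invoice_vocabulary))
```

A set is used for the known features so that each membership check is constant time on average; a real system would presumably also need fuzzy matching for OCR noise, which this sketch leaves out.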
Publication number: US2011258150(A1) Publication date: 2011.10.20
Application number: US201113007430 Filing date: 2011.01.14
Applicant: COPANION, INC. Inventors: NEOGI DEPANKAR;LADD STEVEN K.;WELLING GIRISH;KUMAR ARJUN;SINGH VARTIKA;DUGGAN MATTHEW;MAHATA TUSHAR;YANG XIAOBIN;XU JIAN-WU;O'NEIL JANICE;SARKAR NIRUPAM;KRISHNA GOPAL
Classification: G06F15/18 Main classification: G06F15/18
Agency: Agent:
Main claim:
Address: