Title of invention METHOD AND SYSTEM FOR DOCUMENT DATA EXTRACTION TEMPLATE MANAGEMENT
Abstract User acceptance of a given data extraction template and the number of data fields that the data extraction template can extract accurately are used to calculate a data extraction template ranking, or ranking score, to be associated with the data extraction template. The data extraction template having the highest data extraction template ranking score is then used in a first attempt to extract data from source documents of the source document type associated with the data extraction template. As more data extraction templates associated with a given source document type are received, data extraction template ranking scores are updated/modified, and, in one example, the data extraction templates having the lowest data extraction template ranking scores are detected/eliminated.
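The selection and elimination behavior summarized in the abstract can be illustrated with a minimal Python sketch. The function names `attempt_order` and `prune_lowest`, the `keep` parameter, and the representation of scores as a plain dictionary are assumptions made for illustration; the abstract does not specify how many low-ranked templates are eliminated or in what order the remaining templates are tried after the first attempt.

```python
def attempt_order(scores: dict[str, float]) -> list[str]:
    """Return template ids sorted by ranking score, highest first.

    The first entry is the template used in the first extraction attempt.
    Trying the remaining templates in descending order is an assumption.
    """
    return sorted(scores, key=lambda template_id: scores[template_id], reverse=True)


def prune_lowest(scores: dict[str, float], keep: int) -> dict[str, float]:
    """Keep only the `keep` highest-scoring templates (hypothetical policy)."""
    survivors = attempt_order(scores)[:keep]
    return {template_id: scores[template_id] for template_id in survivors}


# Hypothetical scores for three templates of one source document type.
scores = {"t1": 0.4, "t2": 0.9, "t3": 0.1}
first_choice = attempt_order(scores)[0]
remaining = prune_lowest(scores, keep=2)
```

Here `first_choice` is the template with the highest score, and `remaining` drops the single lowest-ranked template.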
Publication number US2015127659(A1) Publication date 2015.05.07
Application number US201314069795 Filing date 2013.11.01
Applicant Intuit Inc. Inventors Madhani Sunil;Sreepathy Anu;Shenoy Mithun U.
Classification G06F17/30;G06K9/00 Main classification G06F17/30
Agency Agent
Main claim 1. A computing system implemented method for document data extraction template management comprising the following, which when executed individually or collectively by any set of one or more processors perform a process including: receiving data extraction template data representing a data extraction template associated with a specific source document type; determining a field hit count number associated with the data extraction template, the field hit count number indicating the number of data fields from which data can be extracted from the specific source document type using the data extraction template; using the data extraction template to extract data from received source documents of the specific source document type; monitoring the acceptance or rejection of data extracted from received source documents of the specific source document type using the data extraction template; determining a data acceptance count to be associated with the data extraction template, the data acceptance count indicating the number of times the data extracted from received source documents of the specific source document type using the data extraction template is accepted; transforming the field hit count number associated with the data extraction template and the data acceptance count associated with the data extraction template into data extraction template ranking score data for the data extraction template; saving the data extraction template data and the data extraction template ranking score data for the data extraction template as ranked data extraction template data; and aggregating ranked data extraction template data associated with two or more data extraction templates associated with the specific source document type.
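The steps recited in the claim (tracking a field hit count and a data acceptance count, transforming them into a ranking score, and aggregating ranked templates per source document type) might be sketched in Python as follows. The claim does not state how the two counts are combined, so the weighted sum used here is an assumption, as are the class name `RankedTemplate`, the weights, and the `"invoice"` document type.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RankedTemplate:
    template_id: str
    document_type: str
    field_hit_count: int       # number of fields the template can extract
    acceptance_count: int = 0  # times extracted data was accepted

    def record_feedback(self, accepted: bool) -> None:
        """Monitor acceptance or rejection; only acceptances are counted."""
        if accepted:
            self.acceptance_count += 1

    def ranking_score(self, hit_weight: float = 1.0,
                      accept_weight: float = 1.0) -> float:
        # Assumed transformation: a weighted sum of the two counts.
        return (hit_weight * self.field_hit_count
                + accept_weight * self.acceptance_count)


def aggregate_by_document_type(templates):
    """Group templates by document type, highest ranking score first."""
    grouped = defaultdict(list)
    for template in templates:
        grouped[template.document_type].append(template)
    for group in grouped.values():
        group.sort(key=lambda t: t.ranking_score(), reverse=True)
    return dict(grouped)


# Hypothetical usage: two templates for the same document type.
a = RankedTemplate("a", "invoice", field_hit_count=10)
b = RankedTemplate("b", "invoice", field_hit_count=12)
for accepted in (True, True, True, False):
    a.record_feedback(accepted)
ranked = aggregate_by_document_type([a, b])
```

In this sketch, template `a` extracts fewer fields than `b` but overtakes it after three acceptances, which shows how user feedback can reorder templates over time under the assumed formula.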
Address Mountain View CA US