Title of Invention |
BUILDING CLASSIFICATION AND EXTRACTION MODELS BASED ON ELECTRONIC FORMS |
Abstract |
According to one embodiment, a computer-implemented method is configured for building a classification and/or data extraction knowledge base using an electronic form. The method includes: receiving an electronic form having associated therewith a plurality of metadata labels, each metadata label corresponding to at least one element of interest represented within the electronic form; parsing the plurality of metadata labels to determine characteristic features of the element(s) of interest; building a representation of the electronic form based on the plurality of metadata labels; generating a plurality of permutations of the representation of the electronic form by applying a predetermined set of variations to the representation; and training either a classification model, an extraction model, or both using: the representation of the electronic form, and the plurality of permutations of the representation of the electronic form. Corresponding systems and computer program products are also disclosed. |
Publication Number |
US2017109610(A1) |
Publication Date |
2017.04.20 |
Application Number |
US201615396322 |
Filing Date |
2016.12.30 |
Applicant |
Kofax, Inc. |
Inventors |
Macciola Anthony;Amtrup Jan W.;Thompson Stephen Michael |
Classification Numbers |
G06K9/62;G06K9/00 |
Main Classification Number |
G06K9/62 |
Agency |
|
Agent |
|
Principal Claim |
1. A computer-implemented method for building a classification and/or data extraction knowledge base using an electronic form, the method comprising:
receiving an electronic form having associated therewith a plurality of metadata labels, each metadata label corresponding to at least one element of interest represented within the electronic form;
parsing the plurality of metadata labels to determine characteristic features of the element(s) of interest;
building a representation of the electronic form based on the plurality of metadata labels;
generating a plurality of permutations of the representation of the electronic form by applying a predetermined set of variations to the representation; and
training either a classification model, an extraction model, or both using: the representation of the electronic form, and the plurality of permutations of the representation of the electronic form. |
Address |
Irvine, CA, US
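
The steps recited in the principal claim (receive a labeled form, parse the labels, build a representation, generate permutations, train) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the record: the `Element` fields, the dictionary layout of the form, the offset and scale variations, and the toy `CentroidModel` that stands in for both the classification and the extraction model.

```python
import math
from dataclasses import dataclass, replace
from itertools import product
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Element:
    """One element of interest, described by its (hypothetical) metadata label."""
    label: str   # e.g. "invoice_number"
    kind: str    # e.g. "text", "date", "number"
    x: float     # assumed page coordinates of the element
    y: float


Representation = Tuple[Element, ...]
Region = Tuple[str, float, float]  # unlabeled (kind, x, y) found on a new document


def build_representation(form: dict) -> Representation:
    """Parse the metadata labels and build a representation of the form."""
    return tuple(
        Element(f["label"], f["kind"], float(f["x"]), float(f["y"]))
        for f in form["fields"]
    )


# Assumed "predetermined set of variations": small shifts and rescalings.
OFFSETS = [(-5.0, 0.0), (5.0, 0.0), (0.0, -5.0), (0.0, 5.0)]
SCALES = [0.9, 1.1]


def generate_permutations(rep: Representation) -> List[Representation]:
    """Apply every offset/scale combination to every element position."""
    return [
        tuple(replace(e, x=e.x * s + dx, y=e.y * s + dy) for e in rep)
        for (dx, dy), s in product(OFFSETS, SCALES)
    ]


class CentroidModel:
    """Toy stand-in model: mean position of each label, per form class."""

    def __init__(self) -> None:
        self.centroids: Dict[str, Dict[str, Region]] = {}

    def train(self, form_class: str, reps: List[Representation]) -> None:
        sums: Dict[str, Region] = {}
        for rep in reps:
            for e in rep:
                kind, sx, sy = sums.get(e.label, (e.kind, 0.0, 0.0))
                sums[e.label] = (kind, sx + e.x, sy + e.y)
        n = len(reps)
        self.centroids[form_class] = {
            lab: (k, sx / n, sy / n) for lab, (k, sx, sy) in sums.items()
        }

    @staticmethod
    def _nearest(kind: str, cx: float, cy: float,
                 regions: List[Region]) -> Tuple[float, int]:
        # (distance, index) of the closest region of the same kind.
        found = [(math.hypot(x - cx, y - cy), i)
                 for i, (k, x, y) in enumerate(regions) if k == kind]
        return min(found) if found else (math.inf, -1)

    def classify(self, regions: List[Region]) -> str:
        """Pick the form class whose learned layout best fits the regions."""
        def cost(form_class: str) -> float:
            return sum(self._nearest(k, cx, cy, regions)[0]
                       for k, cx, cy in self.centroids[form_class].values())
        return min(self.centroids, key=cost)

    def extract(self, form_class: str, regions: List[Region]) -> Dict[str, int]:
        """Map each learned label to the index of its nearest region."""
        return {lab: self._nearest(k, cx, cy, regions)[1]
                for lab, (k, cx, cy) in self.centroids[form_class].items()}
```

In this sketch, training would call `model.train(name, [rep] + generate_permutations(rep))` so that the model sees both the original representation and its permutations, mirroring the final step of the claim. A production system would presumably use richer features and a learned model; the centroid approach is chosen only to keep the example dependency-free.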