Title BUILDING CLASSIFICATION AND EXTRACTION MODELS BASED ON ELECTRONIC FORMS
Abstract According to one embodiment, a computer-implemented method is configured for building a classification and/or data extraction knowledge base using an electronic form. The method includes: receiving an electronic form having associated therewith a plurality of metadata labels, each metadata label corresponding to at least one element of interest represented within the electronic form; parsing the plurality of metadata labels to determine characteristic features of the element(s) of interest; building a representation of the electronic form based on the plurality of metadata labels; generating a plurality of permutations of the representation of the electronic form by applying a predetermined set of variations to the representation; and training either a classification model, an extraction model, or both using: the representation of the electronic form, and the plurality of permutations of the representation of the electronic form. Corresponding systems and computer program products are also disclosed.
Publication No. US2017109610(A1) Publication date 2017.04.20
Application No. US201615396322 Filing date 2016.12.30
Applicant Kofax, Inc. Inventors Macciola Anthony;Amtrup Jan W.;Thompson Stephen Michael
Classification G06K9/62;G06K9/00 Main classification G06K9/62
Agency Agent
Principal claim 1. A computer-implemented method for building a classification and/or data extraction knowledge base using an electronic form, the method comprising: receiving an electronic form having associated therewith a plurality of metadata labels, each metadata label corresponding to at least one element of interest represented within the electronic form; parsing the plurality of metadata labels to determine characteristic features of the element(s) of interest; building a representation of the electronic form based on the plurality of metadata labels; generating a plurality of permutations of the representation of the electronic form by applying a predetermined set of variations to the representation; and training either a classification model, an extraction model, or both using: the representation of the electronic form, and the plurality of permutations of the representation of the electronic form.
Address Irvine CA US
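
Illustrative sketch (not part of the patent record). The steps recited in claim 1 (receive a labeled electronic form, parse the metadata labels, build a representation, generate permutations from a predetermined set of variations, train a model) might be sketched as follows. Everything concrete here is an assumption made for illustration: the `Field` structure, the dictionary layout of the form, the choice of position shifts and font scaling as the "variations", and the toy lookup-table "models" are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class Field:
    """Hypothetical features of one element of interest."""
    label: str        # metadata label, e.g. "invoice_number"
    x: float          # assumed page position of the element
    y: float
    font_size: float


def parse_labels(form):
    """Parse metadata labels into a representation (a tuple of Field)."""
    return tuple(
        Field(m["label"], m["x"], m["y"], m.get("font_size", 10.0))
        for m in form["metadata"]
    )


def permute(representation, shifts=(-5.0, 0.0, 5.0), scales=(0.9, 1.0, 1.1)):
    """Apply an assumed, predetermined set of variations to the representation."""
    permutations = []
    for dx, dy, s in product(shifts, shifts, scales):
        if dx == 0.0 and dy == 0.0 and s == 1.0:
            continue  # skip the identity; the original is used separately
        permutations.append(tuple(
            replace(f, x=f.x + dx, y=f.y + dy, font_size=f.font_size * s)
            for f in representation
        ))
    return permutations


def train(form_class, representation, permutations):
    """Train toy stand-ins for a classification and an extraction model."""
    samples = [representation, *permutations]
    # "Classifier": maps the set of labels seen on a form to its class.
    classifier = {frozenset(f.label for f in rep): form_class for rep in samples}
    # "Extractor": mean position at which each label was observed.
    sums = {}
    for rep in samples:
        for f in rep:
            sx, sy, n = sums.get(f.label, (0.0, 0.0, 0))
            sums[f.label] = (sx + f.x, sy + f.y, n + 1)
    extractor = {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}
    return classifier, extractor


# Usage with a made-up form carrying two metadata labels.
form = {"metadata": [
    {"label": "invoice_number", "x": 100.0, "y": 40.0},
    {"label": "total", "x": 100.0, "y": 300.0, "font_size": 12.0},
]}
representation = parse_labels(form)
permutations = permute(representation)
classifier, extractor = train("invoice", representation, permutations)
```

A real system would presumably replace the lookup tables with statistical learners and draw the variations from observed scanning and printing distortions; the sketch only shows how the original representation and its permutations together form the training set.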