Title of Invention |
Automatic identification of fields and labels in forms |
Abstract |
A system and method for processing form images including strokes. A controller receives a plurality of form images including a plurality of strokes. A stroke identification module identifies the position of each stroke in each of the form images. A geometry engine generates an overlay of the plurality of form images and identifies a group of overlapping strokes from the overlay. The geometry engine generates a field bounding box encompassing the group of strokes, the field bounding box representing a field in the plurality of form images. The geometry engine crops a field image from each form image based on the size and position of the field bounding box. A label detector analyzes an area around the field image in the form image to determine a label and generates a label image. |
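The overlay and grouping step in the abstract can be illustrated with a minimal sketch. It assumes each stroke has already been reduced to an axis-aligned bounding box in a coordinate system shared by all form images; the names `boxes_overlap`, `group_overlapping`, `field_bounding_box`, and `find_fields` are hypothetical and do not come from the patent. Union-find is used here only as one convenient way to group overlapping strokes, and cropping the field image is left out.

```python
from typing import Dict, List, Tuple

# Assumed stroke representation: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True when two axis-aligned boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def group_overlapping(boxes: List[Box]) -> List[List[Box]]:
    """Group boxes that overlap directly or through a chain of overlaps."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if boxes_overlap(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups: Dict[int, List[Box]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return list(groups.values())


def field_bounding_box(group: List[Box]) -> Box:
    """Smallest box that encompasses every stroke box in the group."""
    return (
        min(b[0] for b in group),
        min(b[1] for b in group),
        max(b[2] for b in group),
        max(b[3] for b in group),
    )


def find_fields(forms: List[List[Box]]) -> List[Box]:
    """Overlay the stroke boxes of all forms and return one box per field."""
    overlay = [box for form in forms for box in form]
    return sorted(field_bounding_box(g) for g in group_overlapping(overlay))


# Two filled-in copies of the same form, each with strokes in two fields.
forms = [
    [(10, 10, 50, 20), (10, 100, 60, 110)],
    [(12, 8, 55, 22), (15, 98, 58, 112)],
]
fields = find_fields(forms)
```

Each resulting box could then be used to crop the same region from every form image, which is the role the abstract gives to the field bounding box.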
Publication Number |
US9058516(B2) |
Publication Date |
2015.06.16 |
Application Number |
US201213415766 |
Filing Date |
2012.03.08 |
Applicant |
Ricoh Company, Ltd. |
Inventor |
Barrus John W. |
Classification Numbers |
G06F17/22;G06K9/00;G06F17/24 |
Main Classification Number |
G06F17/22 |
Agency |
Patent Law Works LLP |
Agent |
Patent Law Works LLP |
Main Claim |
1. A computer-implemented method for generating symbolic information for a first set of field images associated with a first field, the method comprising:
receiving the first set of field images associated with the first field, the first set of field images associated with the first field cropped from a plurality of form images;
retrieving a first label image associated with the first field, the first label image associated with the first field cropped from one of the plurality of form images;
determining a match for the first label image from a classification dictionary;
associating, with one or more processors, symbolic information corresponding to the match for the first label image with the first label image;
determining a subject matter associated with the first label image using the symbolic information associated with the first label image;
identifying a subset of the classification dictionary using the subject matter associated with the first label image;
determining a match for each field image of the first set of field images from the subset of the classification dictionary; and
associating symbolic information corresponding to the match for each field image with each corresponding field image of the first set of field images. |
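The sequence of steps in the claim can be sketched as follows. This is an illustration under strong assumptions, not the patented implementation: images are stood in for by small feature tuples, a "match" is taken to be the nearest dictionary entry by squared Euclidean distance, and the names `Entry`, `DICTIONARY`, `SUBJECT_BY_LABEL`, `best_match`, and `classify_field` are all hypothetical. The claim itself does not specify how matching is done or how the subject matter is derived.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Entry:
    features: Tuple[float, ...]  # assumed stand-in for an image
    symbol: str                  # symbolic information for the entry
    kind: str                    # "label", or a subject such as "digits"


# Hypothetical classification dictionary with label and character entries.
DICTIONARY = [
    Entry((5.0, 5.0), "Age", "label"),
    Entry((9.0, 9.0), "Name", "label"),
    Entry((0.0, 1.0), "O", "letters"),
    Entry((0.1, 1.0), "0", "digits"),
    Entry((1.0, 0.0), "1", "digits"),
]

# Hypothetical mapping from a label's symbolic information to a subject.
SUBJECT_BY_LABEL = {"Age": "digits", "Name": "letters"}


def best_match(features: Sequence[float], entries: List[Entry]) -> Entry:
    """Return the entry closest to the features (squared distance)."""
    return min(
        entries,
        key=lambda e: sum((x - y) ** 2 for x, y in zip(features, e.features)),
    )


def classify_field(
    label_features: Sequence[float],
    field_features: List[Sequence[float]],
    dictionary: List[Entry],
) -> Tuple[str, List[str]]:
    # Match the label image against the full dictionary.
    label_symbol = best_match(label_features, dictionary).symbol
    # Use the label's symbolic information to determine the subject matter.
    subject = SUBJECT_BY_LABEL.get(label_symbol)
    if subject is None:
        raise ValueError(f"no subject known for label {label_symbol!r}")
    # Restrict the dictionary to entries for that subject.
    subset = [e for e in dictionary if e.kind == subject]
    # Match every field image against the subset only.
    field_symbols = [best_match(f, subset).symbol for f in field_features]
    return label_symbol, field_symbols


label, values = classify_field(
    (5.1, 4.9), [(0.0, 1.0), (0.9, 0.1)], DICTIONARY
)
```

In this toy data the first field image is closest to the letter "O" in the full dictionary, but because the label resolves to "Age" the search is limited to digit entries and the result is "0". That narrowing is the point of identifying a subset before matching the field images.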
Address |
Tokyo JP |