Title: Method and system for identifying anchors for fields using optical character recognition data
Abstract: Identifying anchors for fields using optical character recognition data is described. A collection of characters is identified. The collection of characters includes a first set of characters at a first position relative to a first field in a first document and a second set of characters at a second position relative to the first field in the first document. The first set of characters is associated with a first word, and the second set of characters is associated with a second word. An anchor is created based on the collection of characters, wherein the anchor is at a third relative position to the first field in the first document. A second field is identified in a second document by identifying the anchor in the second document.
Publication number: US9396540(B1)  Publication date: 2016.07.19
Application number: US201313855933  Filing date: 2013.04.03
Applicant: EMC CORPORATION  Inventor: Sampson Steven
Classification: G06F17/30; G06T7/00  Main classification: G06F17/30
Agency: Dergosits & Noah LLP  Agent: Dergosits & Noah LLP; Noah Todd A.
Main claim: 1. A system for identifying anchors for fields using optical character recognition data, the system comprising: one or more processors; and a non-transitory computer readable medium storing a plurality of instructions, which when executed, cause the one or more processors to: identify a first collection of characters comprising a first set of characters at a first position relative to a first field in a first document and a second set of characters at a second position relative to the first field in the first document, wherein the first set of characters is associated with a first word and the second set of characters is associated with a second word; create a first anchor in the first document based on the first collection of characters, wherein the first anchor is at a third position relative to the first field in the first document, and wherein the first anchor is associated with a second field in the first document; identify a second collection of characters comprising a third set of characters at a fourth position relative to a third field in a second document and a fourth set of characters at a fifth position relative to the third field in the second document, wherein the third set of characters is associated with a third word and the fourth set of characters is associated with a fourth word; determine a location of a second anchor in the second document by calculating a vector based on the first, second, third and fourth sets of characters; and identify a fourth field in the second document that corresponds to the second field in the first document based on the location of the second anchor in the second document.
Address: Hopkinton MA US
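
The anchor idea in the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that OCR output is available as words with a center position, that an anchor is the centroid of two nearby words plus an offset vector to the field, and that the second document differs from the first only by a translation. The names `Word`, `Anchor`, `create_anchor`, and `locate_field` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """An OCR word with the center of its bounding box (hypothetical structure)."""
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class Anchor:
    """Texts of the anchor words plus the offset vector from anchor to field."""
    texts: tuple
    dx: float
    dy: float


def centroid(words):
    """Mean position of a group of words; used here as the anchor position."""
    n = len(words)
    return (sum(w.x for w in words) / n, sum(w.y for w in words) / n)


def create_anchor(words, field_xy):
    """Build an anchor from words near a field in the first document."""
    ax, ay = centroid(words)
    return Anchor(tuple(w.text for w in words), field_xy[0] - ax, field_xy[1] - ay)


def locate_field(anchor, ocr_words):
    """Find the anchor words in another document and project the field position.

    Assumption: exact text match and pure translation between documents.
    Returns None when any anchor word is missing.
    """
    matched = []
    for text in anchor.texts:
        hit = next((w for w in ocr_words if w.text == text), None)
        if hit is None:
            return None
        matched.append(hit)
    ax, ay = centroid(matched)
    return (ax + anchor.dx, ay + anchor.dy)


# First document: two words to the left of a field centered at (250, 50).
doc1_words = [Word("Invoice", 100.0, 50.0), Word("Number", 160.0, 50.0)]
anchor = create_anchor(doc1_words, field_xy=(250.0, 50.0))

# Second document: the same words appear shifted by (+10, +30).
doc2_words = [
    Word("Date", 20.0, 20.0),
    Word("Invoice", 110.0, 80.0),
    Word("Number", 170.0, 80.0),
]
predicted = locate_field(anchor, doc2_words)
print(predicted)
```

Using two words rather than one makes the anchor less likely to match an unrelated occurrence of a single common word. A fuller treatment would also need to handle OCR misreads, scaling, and rotation, none of which this sketch attempts.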