Title of invention: DETERMINING A TEXT STRING BASED ON VISUAL FEATURES OF A SHRED
Abstract: A shred is digital data that includes an image of a portion of a document, such as a field of a form. Optical Character Recognition (OCR) is traditionally used to convert images of text into textual content. However, OCR engines often cannot reliably convert images of handwritten text into textual content. In a disclosed technique, a library of shreds is created in which each shred is manually associated with a character string that represents the textual content of the shred. A computer extracts visual features of a new shred that includes an image of handwritten text. Based on the visual features, and without performing OCR, the computer identifies a shred from the library that is visually similar to the new shred, and determines that the character string associated with the library shred accurately represents the textual content of the new shred.
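Publication number: US2017076152(A1) Publication date: 2017.03.16
Application number: US201615264419 Filing date: 2016.09.13
Applicant: Captricity, Inc. Inventors: Asl Ehsan Hosseini; Guha Angshuman
Classification: G06K9/00; G06K9/46; G06K9/62 Main classification: G06K9/00
Agency: Agent: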
Main claim: 1. A method for determining a character string that represents textual content of a hand-written image of the character string without executing an optical character recognition engine, the method comprising: generating a library that includes a digital image of each of a plurality of hand-written character strings by: storing, by a computing system at a storage device, the digital images of the plurality of hand-written character strings; associating, by the computing system via a database, each of the digital images with a manually determined character string that represents textual content of the digital image; and for each of the digital images: determining, by the computing system executing a visual feature extractor, a plurality of visual features based on, and associating the plurality of visual features with, each of the digital images, wherein the digital images include a particular digital image associated via the database with a particular plurality of visual features determined based on the particular digital image, and associated via the database with a particular character string that represents textual content of the particular digital image; determining which of the manually determined character strings to associate with a first digital image of a first hand-written character string by: receiving, by the computing system, the first digital image, determining, by the computing system executing the visual feature extractor, a first plurality of visual features based on the first digital image, and associating, by the computing system, the first digital image with the particular character string based on the first plurality of visual features and the particular plurality of visual features.
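The flow recited in the claim (build a labeled library, extract visual features, then associate a new image with the label of a similar library image) can be illustrated with a minimal Python sketch. This is not the applicant's implementation. The record does not specify the visual feature extractor or the similarity measure, so the sketch assumes a toy extractor (ink density per grid cell) and a nearest-neighbor match by Euclidean distance. The names `extract_features`, `ShredLibrary`, `add`, and `match`, the in-memory list standing in for the database, and the sample labels are all hypothetical.

```python
import math
from dataclasses import dataclass, field

# Assumed image representation: non-empty rectangular rows of 0/1 pixels, 1 = ink.
Image = list[list[int]]


def extract_features(image: Image, grid: int = 2) -> list[float]:
    """Hypothetical visual feature extractor: mean ink density per grid cell."""
    height, width = len(image), len(image[0])
    features = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * height // grid, (gy + 1) * height // grid
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            features.append(sum(cell) / len(cell) if cell else 0.0)
    return features


@dataclass
class ShredLibrary:
    """In-memory stand-in for the database of labeled shreds."""
    entries: list[tuple[list[float], str]] = field(default_factory=list)

    def add(self, image: Image, label: str) -> None:
        # Store the features together with the manually determined string.
        self.entries.append((extract_features(image), label))

    def match(self, image: Image) -> str:
        # Return the label of the visually closest library shred; no OCR runs.
        if not self.entries:
            raise ValueError("library is empty")
        query = extract_features(image)
        _, label = min(self.entries, key=lambda e: math.dist(query, e[0]))
        return label


library = ShredLibrary()
library.add([[1, 1, 0, 0]] * 4, "Smith")                # ink on the left half
library.add([[1, 1, 1, 1]] * 2 + [[0] * 4] * 2, "Jones")  # ink on the top half

# A new shred that resembles the first library image, with one pixel missing.
query = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]
print(library.match(query))  # prints Smith
```

A practical system would likely use a far richer feature extractor and a similarity threshold below which no library label is assigned. Both are choices the record leaves open.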
Address: Oakland CA US