Title of invention: Method and apparatus for converting bitmap image documents to editable coded data using a standard notation to record document recognition ambiguities.
Abstract: Documents represented as bitmap images (S100) are transformed into coded textual data (S120) and coded graphics data (S160) by graphics and textual recognizers, which use a standard notation for recording the results of the document recognition processes, including any ambiguities, in a document description language. Recognized portions of the document, represented as editable coded data such as ASCII, are placed in elements defined in the document description language, with all contents of an element sharing some common characteristic. Elements can include, for example: character-string-elements (S140), questionable-character-elements (S150), questionable-word-elements, verified-word-elements, alternative-word-elements, segment-elements, and arc-elements. Each element includes editable coded data, which also includes uncertainty information (S155) identifying any coded data which was not transformed with a predetermined level of confidence.
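The element scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not give the syntax of the document description language, so everything below is an assumption made for illustration: the class names (borrowed from the element names in the abstract), the `build_elements` and `to_notation` helpers, the 0.9 confidence threshold, and the angle-bracket output notation are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


# Hypothetical element types, named after two of the elements in the abstract.
@dataclass
class CharacterStringElement:
    text: str  # characters recognized at or above the confidence threshold


@dataclass
class QuestionableCharacterElement:
    char: str          # best guess for a low-confidence character
    confidence: float  # uncertainty information kept alongside the coded data


Element = Union[CharacterStringElement, QuestionableCharacterElement]


def build_elements(recognized: List[Tuple[str, float]],
                   threshold: float = 0.9) -> List[Element]:
    """Group (character, confidence) pairs into elements.

    Consecutive confident characters are merged into one string element;
    each character below the (assumed) threshold gets its own element.
    """
    elements: List[Element] = []
    run: List[str] = []
    for char, confidence in recognized:
        if confidence >= threshold:
            run.append(char)
            continue
        if run:  # close the confident run before the doubtful character
            elements.append(CharacterStringElement("".join(run)))
            run = []
        elements.append(QuestionableCharacterElement(char, confidence))
    if run:
        elements.append(CharacterStringElement("".join(run)))
    return elements


def to_notation(elements: List[Element]) -> str:
    """Serialize elements in an invented, illustrative notation."""
    parts = []
    for element in elements:
        if isinstance(element, CharacterStringElement):
            parts.append(f"<string>{element.text}</string>")
        else:
            parts.append(
                f'<questionable confidence="{element.confidence:.2f}">'
                f"{element.char}</questionable>"
            )
    return "".join(parts)


# Made-up recognizer output for the word "cats", with a doubtful "t".
sample = [("c", 0.99), ("a", 0.98), ("t", 0.55), ("s", 0.97)]
elements = build_elements(sample)
print(to_notation(elements))
```

In this sketch, keeping the confidence value inside the element is what lets a later editing step find and review only the doubtful characters, which is the role the abstract assigns to the uncertainty information.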
Publication number: EP0549329(A2)  Publication date: 1993.06.30
Application number: EP19920311711  Filing date: 1992.12.22
Applicant: XEROX CORPORATION  Inventor: DE LA BEAUJARDIERE, JEAN-MARIE R.
Classification: G06K9/03;G06K9/20;G06K9/72;G06T1/00  Main classification: G06K9/03
Agency:  Agent:
Principal claim:
Address: