Title: Adaptive OCR for books
Abstract: A system and method are presented for scanning entire books or documents all at once using an adaptive process, where the book or document contains both known fonts and unknown fonts. The known fonts are processed through a verification system in which sure words and error words are determined. Both the sure words and the error words are sent to OCR training, where they are re-OCR'ed and repeatedly verified until they meet predetermined quality criteria. Characters or words not meeting the predetermined quality criteria receive additional OCR training until all characters and words pass. Unknown fonts are scanned and clustered together by shape. Outliers in the shapes are manually keyed in. Symbols that are manually classified go to OCR training and then to the known-type optimization process.
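The known-font loop described in the abstract (verify, retrain, re-OCR, repeat until the quality criteria are met) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Word` record, the integer confidence score, the threshold of 90, the round limit, and the `toy_retrain` stand-in are all assumptions introduced here, and the unknown-font clustering branch is not covered.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Word:
    image_id: str    # hypothetical handle to the scanned word image
    text: str        # current OCR reading
    confidence: int  # hypothetical quality score, 0-100


def split_sure_and_error(words: List[Word], threshold: int) -> Tuple[List[Word], List[Word]]:
    """Verification step: separate sure words from error words."""
    sure = [w for w in words if w.confidence >= threshold]
    errors = [w for w in words if w.confidence < threshold]
    return sure, errors


def adaptive_ocr(
    words: List[Word],
    retrain_and_reocr: Callable[[List[Word], List[Word]], List[Word]],
    threshold: int = 90,   # assumed quality criterion
    max_rounds: int = 10,  # assumed safety limit, not from the source
) -> Tuple[List[Word], int]:
    """Repeat verify -> train -> re-OCR until every word passes the threshold."""
    for round_no in range(max_rounds):
        sure, errors = split_sure_and_error(words, threshold)
        if not errors:
            return words, round_no
        # Both sure and error words feed the training step, as in the abstract.
        words = retrain_and_reocr(sure, errors)
    return words, max_rounds


def toy_retrain(sure: List[Word], errors: List[Word]) -> List[Word]:
    """Stand-in for OCR training: pretend each round lifts error words by 20 points."""
    improved = [Word(w.image_id, w.text, min(100, w.confidence + 20)) for w in errors]
    return sure + improved


final_words, rounds = adaptive_ocr(
    [Word("w1", "book", 95), Word("w2", "b0ok", 50)],
    toy_retrain,
)
print(rounds, [w.confidence for w in final_words])
```

In a real system the `retrain_and_reocr` callback would wrap an actual OCR engine and its training step; it is passed in as a parameter here so the control flow can be read and run without one.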
Publication number: US7480411(B1) Publication date: 2009.01.20
Application number: US20080040946 Filing date: 2008.03.03
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: TZADOK ASAF;WALACH EUGENIUSZ
Classification: G06K9/18 Main classification: G06K9/18
Agency: Agent:
Main claim:
Address: