Title of invention: Proof reading of text data generated through optical character recognition
Abstract: A method includes preparing first and second proof reading tools for performing carpet proof reading and side-by-side proof reading of text data, respectively, and recording a log of the times taken to perform proof reading operations using the first and second proof reading tools. The method further includes estimating, based on the times stored in the log, the time to perform proof reading of a character using 1) the first proof reading tool followed by the second proof reading tool, and 2) the second proof reading tool alone. The method further includes determining, for each character value and based on the estimated times, whether to use the first proof reading tool along with the second proof reading tool, or to use the second proof reading tool without the first proof reading tool.
Publication number: US8971670(B2)  Publication date: 2015.03.03
Application number: US201213669789  Filing date: 2012.11.06
Applicant: International Business Machines Corporation  Inventors: Itoh Takashi; Itoki Toshinari; Osogami Takayuki
Classification: G06K9/03  Main classification: G06K9/03
Agency: Fleit Gibbons Gutman Bongini & Bianco PL  Agent: Fleit Gibbons Gutman Bongini & Bianco PL; Giunta Jeffrey N.
Main claim: 1. A system for supporting proof reading of text data generated through optical character recognition, the system comprising: a memory and storage medium; a processor communicatively coupled to the memory and the storage medium, the processor programmed to execute machine readable instructions for performing a method comprising: performing, with a first proof reading tool, carpet proof reading on the text data; performing, with a second proof reading tool, side-by-side proof reading on the text data; storing, with a storage unit, a log of time to perform proof reading operations having been performed on a particular recognized character value within the text data serving as units by using the first proof reading tool and the second proof reading tool; and estimating, with an analysis unit, a first estimated value of time to perform proof reading of the particular recognized character value within the text data using the first proof reading tool along with the second proof reading tool, the first estimated value of time being independent of recognition accuracy of the optical character recognition, and the first estimated value of time being based upon times, recorded in the log, that had been taken to perform proof reading of the particular recognized character value using the first proof reading tool along with the second proof reading tool; estimating, with the analysis unit, a second estimated value of time to perform proof reading of the particular recognized character value within the text data using the second proof reading tool without using the first proof reading tool, the second estimated value of time being independent of recognition accuracy of the optical character recognition, and the second estimated value of time being based upon times, recorded in the log, that had been taken to perform proof reading of the particular recognized character value using the second proof reading tool without using the first proof reading tool; determining, with the analysis unit, for each particular recognized character value within the text data, based on the first estimated value being less than the second estimated value of time, to use the first proof reading tool along with using the second proof reading tool; and determining, with the analysis unit, for each particular recognized character value within the text data, based on the second estimated value being less than the first estimated value of time, to use the second proof reading tool without using the first proof reading tool.
Address: Armonk NY US
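The per-character decision recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `ProofLog`, `choose_strategy`, and the two strategy constants are hypothetical, and the use of the arithmetic mean as the estimator is an assumption (the claim only requires that each estimate be based on times recorded in the log). The claim is also silent on ties and on characters with no logged times, so the sketch falls back to side-by-side proof reading alone in those cases.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical labels for the two workflows compared in claim 1.
CARPET_THEN_SIDE = "carpet+side_by_side"   # first tool along with second tool
SIDE_ONLY = "side_by_side_only"            # second tool without the first


class ProofLog:
    """Stores per-character proof reading times, keyed by recognized
    character value and by the workflow that was used."""

    def __init__(self):
        self._times = defaultdict(lambda: defaultdict(list))

    def record(self, char_value, strategy, seconds):
        """Append one observed proof reading time (seconds per character)."""
        self._times[char_value][strategy].append(seconds)

    def estimate(self, char_value, strategy):
        """Return the mean logged time, or None when nothing was logged.
        Assumption: the mean stands in for the claim's 'estimated value'."""
        samples = self._times.get(char_value, {}).get(strategy, [])
        return mean(samples) if samples else None


def choose_strategy(log, char_value):
    """Pick the workflow with the smaller estimated time for this character."""
    first = log.estimate(char_value, CARPET_THEN_SIDE)
    second = log.estimate(char_value, SIDE_ONLY)
    if first is None or second is None:
        return SIDE_ONLY  # assumption: fallback when the log is incomplete
    # Ties are not covered by the claim; this sketch resolves them to SIDE_ONLY.
    return CARPET_THEN_SIDE if first < second else SIDE_ONLY


# Illustrative, made-up timings (seconds per character).
log = ProofLog()
log.record("O", CARPET_THEN_SIDE, 0.5)
log.record("O", CARPET_THEN_SIDE, 1.5)
log.record("O", SIDE_ONLY, 2.0)
log.record("1", CARPET_THEN_SIDE, 3.0)
log.record("1", SIDE_ONLY, 1.0)

plan = {c: choose_strategy(log, c) for c in ("O", "1", "Z")}
print(plan)
```

Because both estimates come only from logged proof reading times, the choice does not depend on the recognition accuracy of the OCR engine, which matches the independence condition stated in the claim.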