Title of invention IMAGE DOCUMENT PROCESSING APPARATUS, IMAGE DOCUMENT PROCESSING METHOD, IMAGE PROCESSING PROGRAM, AND RECORDING MEDIUM ON WHICH IMAGE PROCESSING PROGRAM IS RECORDED
Abstract PROBLEM TO BE SOLVED: To provide an image document processing apparatus and an image document processing method in which index information is improved to achieve higher search precision. SOLUTION: An image of a character string composed of M characters is clipped from an image document and divided into separate character images, and the image features of each character image are extracted. Based on those features, N (N>1, integer) character images are selected as candidate characters, in descending order of similarity, from a character image feature dictionary that stores the image features of character images character by character, and a first index matrix of M×N cells is prepared for the clipped character string. The candidate character string formed by the candidate characters in the first column of the first index matrix is subjected to lexical analysis according to a predetermined language model, yielding a second index matrix in which the candidate character string has been adjusted to a character string that makes sense. In the language model, statistics are taken before the lexical analysis is performed. COPYRIGHT: (C)2009,JPO&INPIT
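The two index matrices described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration, not the method claimed in the patent: cosine similarity stands in for the unspecified similarity measure, a bigram count table stands in for the "predetermined language model", a Viterbi-style search stands in for the lexical analysis, and the function names `first_index_matrix` and `second_index_matrix` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def first_index_matrix(features, dictionary, n):
    """Build the M x N matrix: row i lists the n dictionary characters
    most similar to the i-th character image, best match first."""
    matrix = []
    for feat in features:
        ranked = sorted(dictionary,
                        key=lambda ch: cosine(feat, dictionary[ch]),
                        reverse=True)
        matrix.append(ranked[:n])
    return matrix


def second_index_matrix(matrix, bigram_counts):
    """Reorder each row so that the first column spells the candidate path
    with the highest bigram score (hypothetical stand-in for the lexical analysis)."""
    if not matrix:
        return []

    def score(prev, cur):
        # Add-one smoothed log count, so unseen bigrams score 0 rather than -inf.
        return math.log(bigram_counts.get(prev + cur, 0) + 1)

    # layers[i][c] = (best score of a path ending in c at row i, previous character)
    layers = [{c: (0.0, None) for c in matrix[0]}]
    for row in matrix[1:]:
        prev_layer = layers[-1]
        layer = {}
        for c in row:
            p = max(prev_layer, key=lambda q: prev_layer[q][0] + score(q, c))
            layer[c] = (prev_layer[p][0] + score(p, c), p)
        layers.append(layer)

    # Backtrack from the best final character to recover the whole path.
    cur = max(layers[-1], key=lambda c: layers[-1][c][0])
    path = [cur]
    for i in range(len(layers) - 1, 0, -1):
        cur = layers[i][cur][1]
        path.append(cur)
    path.reverse()

    # Move the chosen character to column 1; keep the other candidates in order.
    return [[c] + [x for x in row if x != c] for row, c in zip(matrix, path)]
```

In this sketch the lower-ranked candidates stay in each row after reordering, so a search over the index can still match a character that the language model did not select.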
Publication number JP2009026288(A) Publication date 2009.02.05
Application number JP20070246158 Filing date 2007.09.21
Applicant SHARP CORP Inventor WU BO;DOU JIANJUN;LE NING;GO ATOU;JIA JING
Classification G06T1/00;G06F17/30 Main classification G06T1/00
Agency Agent
Main claim
Address