Title of invention CHARACTER IMAGE FEATURE DICTIONARY PREPARATION APPARATUS, DOCUMENT IMAGE PROCESSING APPARATUS HAVING THE SAME, CHARACTER IMAGE FEATURE DICTIONARY PREPARATION PROGRAM, RECORDING MEDIUM ON WHICH CHARACTER IMAGE FEATURE DICTIONARY PREPARATION PROGRAM IS RECORDED, DOCUMENT IMAGE PROCESSING PROGRAM, AND RECORDING MEDIUM ON WHICH DOCUMENT IMAGE PROCESSING PROGRAM IS RECORDED
Abstract An image of a character string composed of M characters is clipped from a document image and divided character by character, and the image features of each character image are extracted. On the basis of those features, N (N>1, integer) character images are selected as candidate characters, in descending order of degree of similarity, from a character image feature dictionary that stores the image features of character images in units of single characters, and a first index matrix of M×N cells is prepared. The candidate character string composed of the candidate characters constituting the first column of the first index matrix is subjected to a lexical analysis according to a predetermined language model, whereby a second index matrix, adjusted so that the character string makes sense, is prepared to be utilized for searching.
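The two-stage procedure in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function names `similarity`, `build_index_matrix`, and `adjust_first_column` are hypothetical, image features are plain numeric vectors, similarity is negative Euclidean distance, and the "predetermined language model" is replaced by a simple lookup in a small word set.

```python
import math
from itertools import product


def similarity(a, b):
    # Assumed measure: negative Euclidean distance, so higher means more similar.
    return -math.dist(a, b)


def build_index_matrix(char_features, dictionary, n):
    """Build the first index matrix: one row per character image (M rows),
    each holding the N dictionary characters most similar to that image."""
    matrix = []
    for feat in char_features:
        ranked = sorted(
            dictionary,
            key=lambda ch: similarity(feat, dictionary[ch]),
            reverse=True,
        )
        matrix.append(ranked[:n])
    return matrix


def adjust_first_column(matrix, lexicon):
    """Build the second index matrix: reorder each row so that the first
    column spells a word in `lexicon` (a stand-in for the language model).
    Among valid words, prefer the one using the highest-ranked candidates."""
    best = None
    for picks in product(*(range(len(row)) for row in matrix)):
        word = "".join(row[j] for row, j in zip(matrix, picks))
        if word in lexicon and (best is None or sum(picks) < sum(best)):
            best = picks
    if best is None:
        # No combination forms a known word; leave the matrix unchanged.
        return [row[:] for row in matrix]
    # Move the chosen candidate of each row to the first column.
    return [[row[j]] + row[:j] + row[j + 1:] for row, j in zip(matrix, best)]


# Hypothetical feature dictionary: character -> feature vector.
dictionary = {
    "c": [0.0, 0.0], "e": [0.1, 0.0],
    "a": [1.0, 1.0], "o": [1.1, 1.0],
    "t": [2.0, 0.0], "l": [2.1, 0.1],
}
# Features extracted from three character images (M = 3); the first is noisy.
char_features = [[0.08, 0.0], [1.02, 1.0], [2.0, 0.02]]

first = build_index_matrix(char_features, dictionary, n=2)
second = adjust_first_column(first, lexicon={"cat"})
print(first)   # [['e', 'c'], ['a', 'o'], ['t', 'l']]
print(second)  # [['c', 'e'], ['a', 'o'], ['t', 'l']]
```

In this toy run the noisy first image ranks "e" above "c", so the first column of the first matrix reads "eat"; because the assumed lexicon contains only "cat", the adjustment promotes "c" to the first column. The exhaustive search over candidate combinations is only practical for small M and N and stands in for whatever lexical analysis the apparatus actually performs.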
Publication number US2009028445(A1) Publication date 2009.01.29
Application number US20080972477 Filing date 2008.01.10
Applicant WU BO;DOU JIANJUN;LE NING;WU YADONG;JIA JING Inventor WU BO;DOU JIANJUN;LE NING;WU YADONG;JIA JING
Classification G06K9/72 Main classification G06K9/72
Agency Agent
Principal claim
Address