Title of invention: Document image processing apparatus
Abstract: An image of a character string composed of M characters is clipped from a document image and divided character by character, and the image features of each character image are extracted. On the basis of these features, N character images (N > 1, an integer) are selected for each character image as candidate characters, in descending order of similarity, from a character image feature dictionary that stores the image features of character images on a per-character basis, and a first index matrix of M×N cells is prepared. The candidate character string formed by the candidate characters in the first column of the first index matrix is subjected to lexical analysis according to a predetermined language model, yielding a second index matrix in which the candidate string is adjusted into a meaningful character string. The second index matrix is then used for searching.
Publication number: US8160402(B2) Publication date: 2012.04.17
Application number: US20080972477 Filing date: 2008.01.10
Applicant: WU BO;DOU JIANJUN;LE NING;WU YADONG;JIA JING;SHARP KABUSHIKI KAISHA Inventor: WU BO;DOU JIANJUN;LE NING;WU YADONG;JIA JING
Classification: G06K9/03;G06K9/18 Main classification: G06K9/03
Agency: Agent:
Principal claim:
Address:
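
The two index matrices described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the feature vectors are toy 2-D tuples, similarity is a negative squared Euclidean distance, and the "predetermined language model" is replaced by a plain lookup in a small word set (`lexicon`). The function names `similarity`, `first_index_matrix`, and `second_index_matrix` are hypothetical.

```python
from itertools import product


def similarity(a, b):
    """Assumed similarity measure: negative squared Euclidean distance."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def first_index_matrix(char_features, dictionary, n):
    """Build an M x N matrix of candidate characters.

    Row i holds the n dictionary characters most similar to the i-th
    clipped character image, in descending order of similarity.
    """
    matrix = []
    for feat in char_features:
        ranked = sorted(
            dictionary,
            key=lambda ch: similarity(feat, dictionary[ch]),
            reverse=True,
        )
        matrix.append(ranked[:n])
    return matrix


def second_index_matrix(matrix, lexicon):
    """Adjust the first column so that it spells a word in the lexicon.

    A set lookup stands in for the language model (an assumption).
    If no combination of candidates forms a known word, the matrix
    is returned unchanged.
    """
    first_column = "".join(row[0] for row in matrix)
    if first_column in lexicon:
        return [list(row) for row in matrix]
    for combo in product(*matrix):
        if "".join(combo) in lexicon:
            # Move the chosen candidate to the front of each row.
            return [
                [c] + [x for x in row if x != c]
                for row, c in zip(matrix, combo)
            ]
    return [list(row) for row in matrix]


# Toy feature dictionary: one hypothetical feature vector per character.
dictionary = {
    "d": (1.0, 0.0),
    "o": (0.0, 1.0),
    "g": (1.0, 1.0),
    "q": (1.1, 1.0),
    "c": (0.0, 0.8),
}
# Features extracted from M = 3 clipped character images (made up).
char_features = [(1.0, 0.05), (0.05, 0.85), (1.08, 1.0)]

first = first_index_matrix(char_features, dictionary, n=2)
second = second_index_matrix(first, lexicon={"dog"})
print("".join(row[0] for row in first))   # first column before adjustment
print("".join(row[0] for row in second))  # first column after adjustment
```

In this toy run the first column of the first matrix spells a non-word because two visually similar characters outrank the correct ones, and the lexicon lookup promotes the lower-ranked candidates so that the first column becomes a known word. The exhaustive search over candidate combinations grows as N^M, so it is only suitable for short strings in a sketch like this one.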