Method of automatic language identification for multi-lingual text recognition, Application No. US20020305499

Title of invention	Method of automatic language identification for multi-lingual text recognition
Abstract	The disclosed invention uses a complex estimation-based approach to identify the languages of portions of a multi-lingual text recognized from a bit-mapped image. Besides traditional steps such as document segmentation, the method comprises new ones, such as generating and testing hypotheses about the characters in the word tokens. The method further includes defining a set of selected language models, estimating words via the language models, defining a set of dictionaries for language selection, estimating the correspondence of each word with the chosen languages, and calculating a complex estimation for the word that takes into account most or all of the above-mentioned estimations. The complex estimation may also include factors for the mutual correspondence of characters and/or words within the line and/or the text, the mutual geometric correspondence of characters within the word and/or the line, the linguistic correspondence of the word with its neighbors, and an estimation of the accuracy of reconstructing the word token image in the presence of distortion.
Publication No.	US2004006467(A1)	Publication date	2004.01.08
Application No.	US20020305499	Filing date	2002.11.29
Applicant	ANISIMOVICH KONSTANTIN;TERESHCHENKO VADIM;RYBKIN VLADIMIR	Inventor	ANISIMOVICH KONSTANTIN;TERESHCHENKO VADIM;RYBKIN VLADIMIR
Classification	G06K9/68;(IPC1-7):G10L15/04	Main classification	G06K9/68
Agency		Agent
Independent claim
Address
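
The abstract describes combining several per-word estimations (language models, dictionaries, agreement with neighboring words) into one complex estimation. The sketch below illustrates that idea only; it is not the patented method. The `Language` class, the alphabet-coverage stand-in for a language model, the weights, and the tiny dictionaries are all hypothetical assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    """Hypothetical language description: an alphabet and a word list."""
    name: str
    alphabet: frozenset
    dictionary: frozenset


def model_score(word, lang):
    """Stand-in language model: share of characters found in the alphabet."""
    lowered = word.lower()
    if not lowered:
        return 0.0
    return sum(ch in lang.alphabet for ch in lowered) / len(lowered)


def dictionary_score(word, lang):
    """1.0 if the word is in the language's dictionary, else 0.0."""
    return 1.0 if word.lower() in lang.dictionary else 0.0


def word_score(word, lang, w_model=0.4, w_dict=0.6):
    """Weighted sum of the two per-word estimations (weights are assumed)."""
    return w_model * model_score(word, lang) + w_dict * dictionary_score(word, lang)


def identify_languages(words, languages, w_neighbor=0.25):
    """Pick a language per word, nudged toward its neighbors' first-pass guess."""
    base = [{lang.name: word_score(w, lang) for lang in languages} for w in words]
    first_pass = [max(scores, key=scores.get) for scores in base]
    result = []
    for i, scores in enumerate(base):
        # Immediate left and right neighbors, where they exist.
        neighbors = first_pass[max(0, i - 1):i] + first_pass[i + 1:i + 2]
        adjusted = {
            name: score + w_neighbor * neighbors.count(name) / max(1, len(neighbors))
            for name, score in scores.items()
        }
        result.append(max(adjusted, key=adjusted.get))
    return result


# Toy data, assumed purely for the example.
LATIN = frozenset("abcdefghijklmnopqrstuvwxyz")
EN = Language("en", LATIN, frozenset({"the", "quick"}))
DE = Language("de", LATIN | frozenset("äöüß"), frozenset({"über", "straße"}))

print(identify_languages(["the", "quick", "über", "Straße"], [EN, DE]))
```

The neighbor term matters for words that the per-word estimations cannot separate: an unknown all-Latin token such as "xyz" ties between the two toy languages on its own, and the language of the adjacent word breaks the tie.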