Title of invention: System and method for improved string matching under noisy channel conditions
Abstract: Described is a system and method for improving string matching in a noisy channel environment. The invention provides a method for identifying string candidates and analyzing the probability that a string candidate matches a user-defined string. In one implementation, a find engine receives a query string, converts an image file into a textual file, and identifies each instance of the query string in the textual file. The find engine also identifies candidates within the textual file that may match the query string. It refers to a confusion table to determine whether a candidate that is a near match to the query string would be an actual match but for a common recognition error. Candidates meeting a probability threshold are identified as matches to the query string. The invention further provides for analysis options including word heuristics, language models, and OCR confidences.
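The confusion-table step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the table entries, the probability values, the `P_CORRECT` constant, and the function names `match_probability` and `find_matches` are all hypothetical, and candidates are simply whitespace-separated tokens of already-recognized text. Word heuristics, language models, and OCR confidences are left out.

```python
# Hypothetical confusion table: (intended text, OCR output) -> probability.
# Every value here is made up for illustration only.
CONFUSIONS = {
    ("rn", "m"): 0.04,
    ("m", "rn"): 0.05,
    ("l", "1"): 0.08,
    ("o", "0"): 0.06,
}
P_CORRECT = 0.9  # assumed probability that one character is recognized correctly


def match_probability(query, candidate):
    """Best-alignment probability that `candidate` is a noisy rendering of `query`."""
    n, m = len(query), len(candidate)
    # best[i][j]: highest probability of producing candidate[:j] from query[:i]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            p = best[i][j]
            if p == 0.0:
                continue
            # Same character on both sides: recognized correctly.
            if i < n and j < m and query[i] == candidate[j]:
                best[i + 1][j + 1] = max(best[i + 1][j + 1], p * P_CORRECT)
            # A known recognition error from the confusion table.
            for (src, dst), q in CONFUSIONS.items():
                if query.startswith(src, i) and candidate.startswith(dst, j):
                    ni, nj = i + len(src), j + len(dst)
                    best[ni][nj] = max(best[ni][nj], p * q)
    return best[n][m]


def find_matches(query, text, threshold):
    """Return (token, probability) pairs whose probability meets the threshold."""
    results = []
    for token in text.split():
        p = match_probability(query, token)
        if p >= threshold:
            results.append((token, p))
    return results


if __name__ == "__main__":
    for token, p in find_matches("modern", "a modem and a modern rnodel", 0.01):
        print(token, round(p, 6))
```

With these assumed values, a search for "modern" accepts both the exact token and "modem" (explained by the "rn" to "m" confusion), while "rnodel" is rejected because its final character cannot be explained by any table entry.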
Publication number: US2003028522(A1)  Publication date: 2003.02.06
Application number: US20010918791  Filing date: 2001.07.30
Applicant: MICROSOFT CORPORATION  Inventors: COLLINS-THOMPSON KEVYN; SCHWEIZER CHARLES B.
Classification: G06F17/27;G06F17/30;G06K9/03;G06K9/72;(IPC1-7):G06F7/00  Main classification: G06F17/27
Agency:  Agent:
Main claim:
Address: