Title of Invention Modified Levenshtein distance algorithm for coding
Abstract <p>Methods and systems for mapping an optical character recognition (OCR) text string to a code included in a coding dictionary. The Levenshtein Distance Algorithm (LDA) is supplemented with additional information in the form of adjustments based on particular character substitutions, insertions and deletions, together with weighting based on multiple alternatives for the OCR text string. An OCR text string mapping method 100 includes: receiving 110 an OCR text string; comparing 120 it with selected text strings from a coding dictionary; computing 130 modified Levenshtein distances associated with the comparisons by determining substitution 140, insertion 150 and deletion 160 penalties and combining 170 the penalties; selecting 180 the best matching text string from the coding dictionary based on the modified Levenshtein distances; determining 190 whether a maximum threshold distance is met; and assigning 200 a code associated with the best matching text string to the OCR text string when the threshold is met, or assigning 210 a null or no code when it is not.</p>
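The steps of method 100 can be illustrated with a minimal Python sketch. It is not the patented implementation. The penalty tables (`SUB_COST`, `DEFAULT_SUB`, `INS_COST`, `DEL_COST`), the function names and the sample dictionary are hypothetical, and the numeric values are made up for illustration. The sketch assumes that "threshold met" means the best distance does not exceed the maximum, and it omits the weighting over multiple OCR alternatives mentioned in the abstract.

```python
from typing import Dict, Optional, Tuple

# Hypothetical substitution penalties for characters OCR commonly confuses.
# The pairs and values are illustrative assumptions, not taken from the source.
SUB_COST: Dict[Tuple[str, str], float] = {
    ("0", "O"): 0.2, ("O", "0"): 0.2,
    ("1", "l"): 0.2, ("l", "1"): 0.2,
    ("5", "S"): 0.3, ("S", "5"): 0.3,
}
DEFAULT_SUB = 1.0  # assumed penalty for any other substitution
INS_COST = 1.0     # assumed penalty for inserting a dictionary character
DEL_COST = 1.0     # assumed penalty for deleting an OCR character


def modified_distance(ocr: str, entry: str) -> float:
    """Weighted edit distance from an OCR string to a dictionary entry."""
    # prev[j] holds the cost of turning the processed OCR prefix into entry[:j].
    prev = [j * INS_COST for j in range(len(entry) + 1)]
    for i, a in enumerate(ocr, start=1):
        curr = [i * DEL_COST]
        for j, b in enumerate(entry, start=1):
            sub = 0.0 if a == b else SUB_COST.get((a, b), DEFAULT_SUB)
            curr.append(min(
                prev[j - 1] + sub,       # substitution (or match)
                curr[j - 1] + INS_COST,  # insertion
                prev[j] + DEL_COST,      # deletion
            ))
        prev = curr
    return prev[-1]


def assign_code(ocr: str, dictionary: Dict[str, str],
                max_distance: float) -> Optional[str]:
    """Return the code of the closest entry, or None if it is too far away."""
    if not dictionary:
        return None
    best = min(dictionary, key=lambda entry: modified_distance(ocr, entry))
    if modified_distance(ocr, best) <= max_distance:
        return dictionary[best]
    return None


# Hypothetical coding dictionary mapping text strings to codes.
CODES = {"BOSTON": "C001", "AUSTIN": "C002"}
result_match = assign_code("B0STON", CODES, max_distance=1.0)
result_none = assign_code("XYZ", CODES, max_distance=1.0)
```

With these assumed tables, the OCR string `B0STON` is only a cheap `0`/`O` substitution away from `BOSTON`, so it receives that entry's code. `XYZ` is farther than the threshold from every entry and receives no code.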
Publication No. GB2434477(A) Publication Date 2007.07.25
Application No. GB20070001002 Filing Date 2007.01.18
Applicant LOCKHEED MARTIN CORPORATION Inventors KURT P KOPCHIK;OREN I OXMAN;TIMOTHY O WITHUM
Classification G06K9/20;G06F17/22;G06K17/00 Main Classification G06K9/20
Agency Agent
Principal Claim
Address