Title of invention |
Extracting information from symbolically compressed document images |
Abstract |
A method and apparatus for extracting information from symbolically compressed document images. A deciphering module generates first and second text strings by deciphering respective sequences of template identifiers in first and second symbolically compressed document images. A conditional n-gram module receives the first and second text strings from the deciphering module and extracts n-gram terms therefrom based on a predicate condition. A comparison module generates a measure of similarity between the first and second symbolically compressed document images based on the n-gram terms extracted by the conditional n-gram module.
|
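The conditional n-gram and comparison steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: it assumes the text strings have already been deciphered, and the predicate (`follows_space`, which keeps only word-initial characters), the function names, and the use of cosine similarity as the measure of similarity are all assumptions introduced here for illustration.

```python
from collections import Counter
from math import sqrt


def follows_space(text, i):
    """Hypothetical predicate: true when text[i] is the first character of a word."""
    return text[i] != " " and (i == 0 or text[i - 1] == " ")


def conditional_ngrams(text, n, predicate):
    """Count n-grams formed only from characters that satisfy the predicate."""
    kept = [ch for i, ch in enumerate(text) if predicate(text, i)]
    return Counter("".join(kept[i:i + n]) for i in range(len(kept) - n + 1))


def cosine_similarity(a, b):
    """Assumed similarity measure: cosine of two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def document_similarity(text_a, text_b, n=2, predicate=follows_space):
    """Compare two deciphered text strings through their conditional n-grams."""
    return cosine_similarity(
        conditional_ngrams(text_a, n, predicate),
        conditional_ngrams(text_b, n, predicate),
    )
```

Filtering characters before forming n-grams means that only the positions selected by the predicate have to be deciphered correctly, which is one plausible reason to condition the n-grams at all.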
Publication number |
US6658151(B2) |
Publication date |
2003.12.02 |
Application number |
US19990289772 |
Filing date |
1999.04.08 |
Applicant |
RICOH CO., LTD. |
Inventors |
LEE DAR-SHYANG;HULL JONATHAN J. |
Classification |
H04N1/40;G06K9/72;H03M7/30;H04N1/41;(IPC1-7):G06K9/72 |
Main classification |
H04N1/40 |
Agency |
|
Agent |
|
Main claim |
|
Address |
|