Title of invention METHOD FOR EXTRACTING DOCUMENT INFORMATION AND MACHINE-READABLE RECORDING MEDIUM RECORDED WITH PROGRAM FOR ALLOWING COMPUTER TO EXECUTE THE SAME METHOD
Abstract PROBLEM TO BE SOLVED: To realize highly reliable keyword extraction and document retrieval by extracting keywords to which layout information is attached. SOLUTION: The method comprises step S201 of inputting a document image; step S202 of extracting layout information from the document image; step S203 of recognizing the characters in each character area extracted in step S202 to obtain a character code string; and step S204 of extracting keywords from the character code string by language analysis and weighting each keyword based on the plural pieces of layout information.
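Step S204 can be illustrated with a minimal Python sketch. It assumes that steps S201 to S203 have already produced recognized text for each character area, and it is not the patented implementation. The abstract does not say which layout attributes are used or how they are combined, so the region labels, the `centered` flag, the numeric weights, and the names `TextRegion`, `layout_weight`, and `extract_weighted_keywords` are all hypothetical. A regular-expression tokenizer with a small stopword list stands in for the language analysis.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical weights per layout attribute; the abstract gives no values.
REGION_WEIGHTS = {"title": 3.0, "heading": 2.0, "body": 1.0}
CENTERED_FACTOR = 1.5  # assumed bonus for centered text

# Tiny stopword list standing in for real language analysis.
STOPWORDS = {"a", "an", "and", "by", "for", "in", "of", "the", "to"}


@dataclass
class TextRegion:
    """One character area after S202/S203 (assumed representation)."""
    text: str             # character code string from recognition (S203)
    region_type: str      # layout attribute from S202 (assumed label set)
    centered: bool = False  # second layout attribute (assumed)


def layout_weight(region: TextRegion) -> float:
    """Combine several layout attributes into one weight (assumed rule)."""
    weight = REGION_WEIGHTS.get(region.region_type, 1.0)
    if region.centered:
        weight *= CENTERED_FACTOR
    return weight


def extract_weighted_keywords(regions: list[TextRegion]) -> dict[str, float]:
    """S204 sketch: extract keywords and accumulate layout-based weights."""
    scores: dict[str, float] = defaultdict(float)
    for region in regions:
        weight = layout_weight(region)
        for token in re.findall(r"[a-z]+", region.text.lower()):
            if token not in STOPWORDS:
                scores[token] += weight
    return dict(scores)


regions = [
    TextRegion("Document Retrieval", "title", centered=True),
    TextRegion("Keyword extraction for document retrieval", "body"),
]
scores = extract_weighted_keywords(regions)
```

In this sketch a word that appears in the centered title scores higher than one that appears only in the body text, which is the kind of layout-dependent weighting a retrieval system could use for ranking.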
Publication number JP2000067080(A) Publication date 2000.03.03
Application number JP19980246520 Filing date 1998.08.18
Applicant RICOH CO LTD Inventor SAITO TAKASHI;UCHIKI TAKAHIRO
Classification G06F17/30;G06T1/00;(IPC1-7):G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim
Address