Title of invention DOCUMENT PROCESSING DEVICE AND METHOD
Abstract PROBLEM TO BE SOLVED: To provide a document processing device that can assign appropriate semantic tags to various kinds of documents. SOLUTION: A general proper expression extraction part 11 and a semantic role word extraction part 12 extract general proper expressions and semantic role words from an input document 100, and a general document structure analysis part 13 computes a basic document structure. A document type identification part 15 selects a document type for the input document by comparing the document model obtained from the extracted general proper expressions and semantic role words with each of the document models, likewise based on general proper expressions and semantic role words, that are defined for the respective document types. A detailed document structure detection part 16 detects substructures of the input document according to the detailed document structure information, based on general proper expressions and semantic role words, that is defined for the selected document type. A semantic tag assignment part 17 assigns the semantic tags predefined for the detailed document structure to the detected substructures to create an output document 101. COPYRIGHT: (C)2007, JPO&INPIT
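The flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the regular expressions standing in for general proper expressions, the role word list, the two document types and their models, the use of cosine similarity for model comparison, the use of text lines as the basic document structure, and all function and tag names.

```python
import math
import re
from collections import Counter

# Hypothetical patterns standing in for "general proper expressions".
PROPER_PATTERNS = {
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "MONEY": re.compile(r"\$\d+(?:\.\d{2})?"),
}
# Hypothetical "semantic role words".
ROLE_WORDS = {"invoice", "total", "agenda", "attendees"}

# Hypothetical document models, one per document type (feature -> count).
DOCUMENT_MODELS = {
    "invoice": Counter({"MONEY": 2, "DATE": 1, "invoice": 1, "total": 1}),
    "minutes": Counter({"DATE": 1, "agenda": 1, "attendees": 1}),
}
# Hypothetical detailed structure per type: role word -> semantic tag.
DETAILED_STRUCTURE = {
    "invoice": {"total": "total_amount", "invoice": "invoice_header"},
    "minutes": {"agenda": "agenda_item", "attendees": "attendee_list"},
}


def extract_features(text):
    """Count proper-expression matches and role words in the text."""
    features = Counter()
    for label, pattern in PROPER_PATTERNS.items():
        count = len(pattern.findall(text))
        if count:
            features[label] = count
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in ROLE_WORDS:
            features[word] += 1
    return features


def similarity(a, b):
    """Cosine similarity between two feature counters (assumed metric)."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def identify_type(features):
    """Pick the document type whose model is closest to the input's model."""
    return max(DOCUMENT_MODELS, key=lambda t: similarity(features, DOCUMENT_MODELS[t]))


def assign_tags(text, doc_type):
    """Tag each line (the assumed basic structure unit) using the type's rules."""
    rules = DETAILED_STRUCTURE[doc_type]
    tagged = []
    for line in text.splitlines():
        words = set(re.findall(r"[a-z]+", line.lower()))
        tag = next((t for w, t in rules.items() if w in words), None)
        tagged.append((tag, line))
    return tagged


def process(text):
    """Run extraction, type identification, and tag assignment in order."""
    doc_type = identify_type(extract_features(text))
    return doc_type, assign_tags(text, doc_type)
```

Calling `process("Invoice 2005-09-29\nWidget $10.00\nTotal $10.00")` selects the hypothetical `invoice` type and tags the first and last lines, leaving the untagged middle line with a tag of `None`.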
Publication number JP2007094855(A) Publication date 2007.04.12
Application number JP20050284885 Filing date 2005.09.29
Applicant TOSHIBA CORP Inventors NUNOME MITSUO; ISHITANI YASUTO
Classification G06F17/21; G06F17/30 Main classification G06F17/21
Agency Agent
Main claim
Address