Title of invention METHOD, DEVICE AND PROGRAM FOR EXTRACTING INFORMATION
Abstract PROBLEM TO BE SOLVED: To accurately extract the components of a structured document, together with their attributes, from its document source. SOLUTION: A pre-processing means 322 modifies the document source information M01 of the structured document received by an input means 121 so that layout analysis and attribute determination can be performed easily. A rendering means 123 generates image data M05 representing the pre-processed source M2 as it is actually rendered. A layout analysis means 124 performs layout analysis on the image data M05. An attribute determination means 325 takes the layout analysis information M07 as input and determines the attributes of the components of the image data. An output means 327 outputs, as a component of the structured document, the part of the document source that corresponds to each component of the image data found by the layout analysis, together with the determined attributes. COPYRIGHT: (C)2004,JPO
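The flow described in the abstract (pre-process, render, analyze layout, determine attributes, output source parts with attributes) can be illustrated with a minimal Python sketch. This is not the patented implementation: every name below (Region, FONT_SIZES, preprocess, render, analyze_layout, determine_attribute, extract) is hypothetical, the "renderer" is a toy that stacks three tag types vertically instead of drawing real image data, and the font-size rule for attributes is an assumption chosen only to show how a visual cue can be mapped back to a source fragment.

```python
import re
from dataclasses import dataclass


@dataclass
class Region:
    source: str      # source fragment that produced this region
    top: int         # vertical position on the toy page, in pixels
    height: int      # rendered height, in pixels
    font_size: int   # rendered font size, in pixels


# Assumed font sizes standing in for a real rendering engine.
FONT_SIZES = {"h1": 24, "h2": 18, "p": 12}


def preprocess(source):
    """Simplify the source: drop comments and collapse whitespace."""
    source = re.sub(r"<!--.*?-->", "", source, flags=re.DOTALL)
    return re.sub(r"\s+", " ", source).strip()


def render(source):
    """Toy renderer: stack each recognized element top to bottom."""
    regions, top = [], 0
    for match in re.finditer(r"<(h1|h2|p)>(.*?)</\1>", source):
        size = FONT_SIZES[match.group(1)]
        height = size * 2
        regions.append(Region(match.group(0), top, height, size))
        top += height
    return regions


def analyze_layout(regions):
    """Report regions in reading order (top to bottom)."""
    return sorted(regions, key=lambda r: r.top)


def determine_attribute(region, regions):
    """Label a region from visual cues only (assumed rule: font size)."""
    sizes = [r.font_size for r in regions]
    if region.font_size == max(sizes):
        return "title"
    if region.font_size > min(sizes):
        return "heading"
    return "body"


def extract(source):
    """Return (source fragment, attribute) pairs for each region."""
    regions = analyze_layout(render(preprocess(source)))
    return [(r.source, determine_attribute(r, regions)) for r in regions]


if __name__ == "__main__":
    page = "<!-- nav -->\n<h1>Report</h1>\n<h2>Intro</h2>\n<p>Some   text.</p>"
    for fragment, attribute in extract(page):
        print(attribute, fragment)
```

The point of the sketch is the last step: each attribute is decided from the rendered appearance, but what is returned is the corresponding fragment of the document source, which mirrors the role the abstract assigns to the output means.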
Publication number JP2004038827(A) Publication date 2004.02.05
Application number JP20020198199 Filing date 2002.07.08
Applicant NEC CORP Inventor FUJIYAMA KENICHIRO;MATSUDA KATSUSHI
Classification G06F17/21 Main classification G06F17/21
Agency Agent
Main claim
Address