Abstract |
<P>PROBLEM TO BE SOLVED: To accurately extract the components of a structured document, together with their attributes, from the document source. <P>SOLUTION: A pre-processing means 322 modifies the document source information M01 of the structured document, received through an input means 121, so that layout analysis and attribute determination can be performed more easily. A rendering means 123 generates the image data M05 that results from actually rendering the pre-processed source M2. A layout analysis means 124 performs layout analysis on the image data M05. An attribute determination means 325 receives the layout analysis information M07 and determines the attributes of the components of the image data. An output means 327 outputs, as a component of the structured document, the part of the document source that corresponds to each image-data component found by the layout analysis, together with its determined attributes. <P>COPYRIGHT: (C)2004,JPO |
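
The five stages named in the abstract (pre-processing, rendering, layout analysis, attribute determination, output) can be illustrated with a toy pipeline. This is only a sketch under strong assumptions and not the patented implementation: the "renderer" is a stand-in that stacks blocks vertically as rows of ink, the layout analysis is a simple row-projection segmentation, and the rule that the tallest region is a heading is invented for illustration. All function names, the `GAP` constant, and the `(markup, font_size)` input format are hypothetical.

```python
GAP = 4  # blank rows drawn between blocks (assumed value)


def preprocess(fragments):
    """Hypothetical pre-processing: drop fragments with no visible markup."""
    return [(text.strip(), size) for text, size in fragments if text.strip()]


def render(fragments):
    """Toy renderer: each fragment becomes `size` rows of ink, then a gap.

    Returns the ink profile (the stand-in for image data) and, per row,
    the index of the source fragment that produced it.
    """
    ink, origin = [], []
    for index, (_, size) in enumerate(fragments):
        ink += [True] * size
        origin += [index] * size
        ink += [False] * GAP
        origin += [None] * GAP
    return ink, origin


def layout_analysis(ink):
    """Split the ink profile into (start, end) regions at blank rows."""
    regions, start = [], None
    for y, has_ink in enumerate(ink + [False]):  # sentinel closes the last region
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            regions.append((start, y))
            start = None
    return regions


def determine_attributes(regions):
    """Assumed rule: the tallest region is a heading, the rest are body."""
    if not regions:
        return []
    tallest = max(end - start for start, end in regions)
    return ["heading" if end - start == tallest else "body"
            for start, end in regions]


def extract(fragments):
    """Run all stages and pair each source fragment with its attribute."""
    prepared = preprocess(fragments)
    ink, origin = render(prepared)
    regions = layout_analysis(ink)
    attributes = determine_attributes(regions)
    return [(prepared[origin[start]][0], attribute)
            for (start, _), attribute in zip(regions, attributes)]


source = [
    ("<h1>Report</h1>", 24),
    ("  ", 12),
    ("<p>First paragraph.</p>", 12),
    ("<p>Second.</p>", 12),
]
result = extract(source)
```

The point of the sketch is the mapping step in `extract`: attributes are decided from the rendered appearance, while the output is the matching part of the document source rather than the image region itself.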