Abstract |
<p>PROBLEM TO BE SOLVED: To provide a method and device for extracting webpage contents that obtain better webpage extraction results.</p><p>SOLUTION: A method and device for extracting webpage contents are disclosed. The method includes: extracting webpage contents from an input webpage based on a digital document analysis (DDA) method to create a DDA extraction result; extracting webpage contents from the input webpage based on a document image resolution (DIR) method to create a DIR extraction result; and combining the DDA extraction result and the DIR extraction result to create a combined result.</p><p>COPYRIGHT: (C)2009,JPO&INPIT</p> |