Title of invention AUTOMATIC SEMANTIC INFORMATION EXTRACTION FROM WEB DOCUMENTS FOR SEMANTIC WEB ANNOTATION
Abstract A method and a system for automatically extracting semantic information from web documents for semantic web annotation are provided, in order to speed up automated semantic processing of large-scale web content. The system comprises a learning data generator (100), an integrated classifier generator (400), and a semantic information extractor (800). The learning data generator collects a large volume of web documents, removes HTML tags from the collected documents, splits compound words, and produces learning data to which semantic tags are attached by means of a learning data editor. The integrated classifier generator uses the learning data to build a support vector machine (200) and a Bayesian classifier, and integrates the two into a single classifier. The semantic information extractor uses the integrated classifier to automatically extract semantic information from new web documents, and outputs the extracted information as ontology instances.
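Two of the steps named in the abstract, HTML tag removal and the integration of two classifiers, can be illustrated with a minimal Python sketch. The abstract does not state how the support vector machine and the Bayesian classifier are integrated, so the weighted average below is only an assumption, and the names `strip_tags`, `integrate`, and `weight_a` are hypothetical. The classifiers themselves are not implemented here; their per-label scores are assumed to be supplied as dictionaries.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and discards all HTML tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text between tags.
        self.parts.append(data)


def strip_tags(html):
    """Return the text content of an HTML fragment with whitespace normalized."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def integrate(scores_a, scores_b, weight_a=0.5):
    """Combine two per-label score dicts and return (best_label, combined).

    Assumption (not from the abstract): both classifiers emit scores in
    [0, 1] over the same set of labels, and integration is a weighted average.
    """
    combined = {
        label: weight_a * scores_a[label] + (1.0 - weight_a) * scores_b[label]
        for label in scores_a
    }
    best_label = max(combined, key=combined.get)
    return best_label, combined
```

In use, `strip_tags` would be applied to each collected document before tagging, and `integrate` would be called with the score dictionaries of the two trained classifiers for each candidate token or phrase; the semantic label names are whatever the learning data editor assigned.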
Publication number KR20080029417(A) Publication date 2008.04.03
Application number KR20060095510 Application date 2006.09.29
Applicant KIM, HONG KI Inventors KIM, HONG KI;KANG, BO YOUNG;GOO, SANG OK;CHOI, HEE CHUL;ZHENG HAI TAO
Classification G06F17/21;G06F17/00 Main classification G06F17/21
Agency Agent
Principal claim
Address