Title of invention: METHOD FOR EXTRACTING SITE OMITTED FROM CLASSIFICATION SERVICE USING DOCUMENT COLLECTING ROBOT AND ESTIMATING SUBJECT
Abstract: PURPOSE: A method for extracting sites omitted from a classification service using a document collecting robot, and for estimating their subjects, is provided. It reduces the time needed for classification work by having the document collecting robot find sites that are not registered in the classification service and examine each site's contents, links, and link distribution in order to estimate its subject. CONSTITUTION: A document collecting robot finds a new web page (201). It is checked whether the web page belongs to a site registered in a classification service (202). If the web page does not belong to a registered site, the subject of each page linked with the web page is extracted (203). The subject is stored in an outer classification page information storage (204). The collected document is stored in a document base (205). Web pages are grouped on the basis of representative pages (206). Only candidates for classification are extracted (207). Statistics on the subjects are compiled and calculated (208). The addresses of the candidate sites and the categories that can be assigned to them are output and sent to classification personnel (209). Newly found sites are stored according to the classification work of the classification personnel (210). If the web page belongs to a site already registered in the classification service, the web page is stored in the document base (211).
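The flow of steps 201 to 209 and 211 can be illustrated with a minimal Python sketch. It is not the patented implementation. Several points are assumptions made only for illustration: a site is identified by its host name (a stand-in for the "representative page"), the subject of a linked page is taken to be the category of the registered site that the link points to, and the names `site_of`, `process_page`, `candidate_report`, and the `min_votes` threshold are hypothetical. Step 210, storing sites after the classification personnel have finished their work, is left out.

```python
from collections import Counter
from urllib.parse import urlparse


def site_of(url):
    """Assumption: the host name stands in for a site's representative page."""
    return urlparse(url).netloc.lower()


def process_page(url, links, registered, document_base, outer_store):
    """Handle one newly found page (steps 201-205 and 211).

    registered    -- dict mapping a registered site to its category
    document_base -- list of collected page URLs
    outer_store   -- dict mapping an unregistered site to the subjects seen
    """
    site = site_of(url)
    if site not in registered:                       # step 202
        for link in links:                           # step 203
            subject = registered.get(site_of(link))  # assumed subject source
            if subject is not None:
                outer_store.setdefault(site, []).append(subject)  # step 204
    document_base.append(url)                        # step 205 or 211


def candidate_report(outer_store, min_votes=2, top_n=3):
    """Group by site, keep candidates, and rank subjects (steps 206-209)."""
    report = {}
    for site, subjects in outer_store.items():       # step 206: one group per site
        if len(subjects) < min_votes:                # step 207: assumed threshold
            continue
        counts = Counter(subjects)                   # step 208: subject statistics
        report[site] = counts.most_common(top_n)     # step 209: categories to propose
    return report
```

In this sketch the report maps each candidate site to its most frequent linked subjects, which is the information the abstract says is sent to classification personnel. A real system would need a better notion of site grouping than the host name and a tuned candidate threshold.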
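Publication number: KR20010064753(A) Publication date: 2001.07.11
Application number: KR19990059043 Application date: 1999.12.18
Applicant: KOREA TELECOM Inventors: KIM, HYEON JEONG; KIM, YEONG MIN; LEE, SANG YEOP; SHIN, EUN GYEONG
Classification: G06F17/30; (IPC1-7): G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address: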