Title of invention: WEBSITE TOP PAGE PRESUMPTION DEVICE, TOP PAGE PRESUMPTION METHOD, PROGRAM FOR THIS METHOD, AND RECORDING MEDIUM WITH THE PROGRAM RECORDED THEREON
Abstract: PROBLEM TO BE SOLVED: To appropriately presume the top page of a website, and to perform information retrieval in website units, suited to the retrieval purpose, starting from the top page. SOLUTION: The server name to which each page of a Web page set belongs is extracted (S1); the URL, server name, directory layer and meta-information of each page are extracted (S2); using a page classification tree, a classification likelihood for the page type is extracted for each page (S3); for each server, a page whose directory layer is 0 and which has a file name located in that layer is presumed to be the top page (S4); if no top page is presumed, the directory layer in which a top page exists is determined from the top-page-type classification likelihood (S5); for every such directory layer, the page in that layer for which a file exists in a lower layer and for which the sum of the classification likelihoods for the page type is maximum is determined to be the top page (S6); and if no top page is found, a page in the layer one level lower whose top-page classification likelihood is equal to or greater than a threshold is determined to be the top page (S7). COPYRIGHT: (C)2003,JPO
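The flow of steps S1, S2, S4 and the likelihood-based fallback can be illustrated with a minimal sketch. It is not the patented implementation: the names `directory_depth` and `presume_top_pages`, the `top_likelihood` mapping (standing in for the output of the classification tree in S3) and the default threshold of 0.5 are all assumptions made for illustration, and the fallback collapses S5 to S7 into a single shallow-to-deep scan.

```python
from collections import defaultdict
from urllib.parse import urlparse
import posixpath


def directory_depth(url):
    """Directory layer of a URL: 0 for a file directly under the server root."""
    path = urlparse(url).path or "/"
    directory = posixpath.dirname(path)
    return len([part for part in directory.split("/") if part])


def presume_top_pages(urls, top_likelihood, threshold=0.5):
    """Return {server name: presumed top page URL}.

    top_likelihood is a hypothetical mapping from URL to a top-page
    classification likelihood; the abstract obtains it from a page
    classification tree (S3), which is not modelled here.
    """
    # S1/S2: group pages by the server name they belong to.
    by_server = defaultdict(list)
    for url in urls:
        by_server[urlparse(url).netloc].append(url)

    def score(url):
        return top_likelihood.get(url, 0.0)

    top_pages = {}
    for server, pages in by_server.items():
        by_depth = defaultdict(list)
        for url in pages:
            by_depth[directory_depth(url)].append(url)

        # S4: a page in directory layer 0 is presumed to be the top page.
        # Breaking ties by likelihood is an assumption of this sketch.
        if 0 in by_depth:
            top_pages[server] = max(by_depth[0], key=score)
            continue

        # Simplified stand-in for S5 to S7: scan layers from shallow to deep
        # and accept the first best-scoring page that reaches the threshold.
        for depth in sorted(by_depth):
            best = max(by_depth[depth], key=score)
            if score(best) >= threshold:
                top_pages[server] = best
                break
    return top_pages
```

A server with no page at layer 0 and no page reaching the threshold is simply left out of the result, which mirrors the case where no top page can be presumed.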
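Publication number: JP2003186731(A) Publication date: 2003.07.04
Application number: JP20010389447 Filing date: 2001.12.21
Applicant: NIPPON TELEGR & TELEPH CORP <NTT> Inventor: MORI KENICHI
Classification: G06F17/30;G06F12/00;(IPC1-7):G06F12/00 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: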