Title of invention Incorrect Hyperlink Detecting Apparatus and Method
Abstract An incorrect hyperlink detecting apparatus that can detect a semantic inconsistency of a hyperlink with high accuracy is provided. The incorrect hyperlink detecting apparatus 10 includes a link source text extracting unit 12 for extracting text from an HTML file 26 of a link source, a link destination text extracting unit 14 for extracting text from the HTML file 26 of a link destination, a morpheme analysis unit 16 for breaking the extracted texts into words, a weighting unit 18 for assigning a weight to every part of speech, a consistency rate calculating unit 20 for calculating the rate at which the words of the link source are included in the words of the link destination as a consistency rate from the link source to the link destination, and the rate at which the words of the link destination are included in the words of the link source as a consistency rate from the link destination to the link source, a degree of association calculating unit 22 for calculating a degree of association, which indicates a probability of the hyperlink, in response to both consistency rates, and a CSV output unit 24 for outputting the consistency rates and the degree of association in CSV form.
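The two consistency rates and the degree of association described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract gives neither the part-of-speech weight values nor the formula that combines the two rates, so the `POS_WEIGHTS` table, the arithmetic mean in `degree_of_association`, and all function names here are assumptions made for illustration. The input is assumed to be already split into (word, part of speech) pairs, standing in for the output of the morpheme analysis unit.

```python
import csv
import io

# Hypothetical part-of-speech weights; the abstract does not state any values.
POS_WEIGHTS = {"noun": 1.0, "verb": 0.5, "other": 0.1}


def consistency_rate(from_words, to_words):
    """Weighted share of distinct words in from_words that also occur in to_words.

    Both arguments are lists of (word, part_of_speech) pairs.
    """
    to_vocab = {word for word, _ in to_words}
    total = 0.0
    matched = 0.0
    for word, pos in set(from_words):
        weight = POS_WEIGHTS.get(pos, POS_WEIGHTS["other"])
        total += weight
        if word in to_vocab:
            matched += weight
    return matched / total if total else 0.0


def degree_of_association(src_to_dst, dst_to_src):
    # Assumption: the two rates are combined by their arithmetic mean.
    # The abstract only says the degree is computed "in response to" both rates.
    return (src_to_dst + dst_to_src) / 2


def to_csv_row(link_url, source_words, destination_words):
    """Return one CSV line: URL, both consistency rates, degree of association."""
    src_to_dst = consistency_rate(source_words, destination_words)
    dst_to_src = consistency_rate(destination_words, source_words)
    degree = degree_of_association(src_to_dst, dst_to_src)
    buffer = io.StringIO()
    csv.writer(buffer).writerow(
        [link_url, f"{src_to_dst:.3f}", f"{dst_to_src:.3f}", f"{degree:.3f}"]
    )
    return buffer.getvalue()


# Example input: words near the link versus words on the linked page.
source = [("patent", "noun"), ("search", "verb")]
destination = [("patent", "noun"), ("office", "noun"), ("the", "other")]
row = to_csv_row("http://example.com/a", source, destination)
```

Under these assumptions, a link whose source and destination share few heavily weighted words receives a low degree of association and can be flagged for review.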
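Publication number US2008172220(A1) Publication date 2008.07.17
Application number US20070623135 Filing date 2007.01.15
Applicant OHSHIMA NORIKO Inventor OHSHIMA NORIKO
Classification G06F17/27 Main classification G06F17/27
Agency Agent
Principal claim
Address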