Title of invention |
AUTOMATIC VISUAL SEGMENTATION OF WEBPAGES |
Abstract |
To provide valuable information about a webpage, the webpage must be divided into distinct, semantically coherent segments for analysis. A set of heuristics allows a segmentation algorithm to identify more accurately an optimal number of segments for a given webpage or any portion thereof. A first heuristic estimates the optimal number of segments for any given webpage or portion thereof. A second heuristic coalesces segments where the number of segments identified far exceeds the recommended optimal number. A third heuristic coalesces segments corresponding to a portion of a webpage with much unused whitespace and little content. A fourth heuristic coalesces segments of nodes whose recommended number of segments falls below a certain threshold into segments of other nodes. A fifth heuristic recursively analyzes and splits segments that correspond to webpage portions exceeding a certain threshold size.
|
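The second heuristic in the abstract (coalescing when the segment count far exceeds the recommended number) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the application: the `Segment` representation, the `coalesce_excess` name, the reading of "far exceeds" as a multiple `tolerance` of the recommended count, and the rule of merging the adjacent pair with the smallest combined area.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """A contiguous run of page content (hypothetical representation)."""
    node_ids: List[str]
    area: int  # rendered area in pixels, assumed to be known


def coalesce_excess(segments: List[Segment], recommended: int,
                    tolerance: float = 2.0) -> List[Segment]:
    """Merge adjacent segments while the count far exceeds `recommended`.

    "Far exceeds" is modelled as count > tolerance * recommended; both this
    rule and the default tolerance are assumptions.
    """
    if recommended < 1:
        raise ValueError("recommended must be at least 1")
    result = list(segments)
    while len(result) > tolerance * recommended and len(result) > 1:
        # Assumed criterion: merge the adjacent pair with the smallest
        # combined area, so small fragments are absorbed first.
        i = min(range(len(result) - 1),
                key=lambda k: result[k].area + result[k + 1].area)
        merged = Segment(result[i].node_ids + result[i + 1].node_ids,
                         result[i].area + result[i + 1].area)
        result[i:i + 2] = [merged]
    return result
```

Merging only adjacent segments keeps each result contiguous on the page, which seems a reasonable property for a visual segment; other merge criteria (for example, visual or textual similarity) would fit the same loop.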
Publication number |
US2009177959(A1) |
Publication date |
2009.07.09 |
Application number |
US20080971160 |
Filing date |
2008.01.08 |
Applicant |
CHAKRABARTI DEEPAYAN;MITAL MANAV RATAN;HAJELA SWAPNIL;VELIPASAOGLU EMRE |
Inventor |
CHAKRABARTI DEEPAYAN;MITAL MANAV RATAN;HAJELA SWAPNIL;VELIPASAOGLU EMRE |
Classification number |
G06F17/21 |
Main classification number |
G06F17/21 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|