Title of Invention: DETERMINING DOCUMENT STRUCTURE SIMILARITY USING DISCRETE WAVELET TRANSFORMATION
Abstract: Examples of the present disclosure may include methods, systems, and computer-readable media with executable instructions. An example method for determining document structure similarity can include segmenting path sequences (206) of Document Object Model (DOM) trees (120, 462) from a number of web pages (202) into B components (561). Path signals (210) corresponding to the path sequences (206) are determined based on a count of the occurrences of particular paths in the Bth component (571), and unique path signals (210) are transformed into discrete wavelet signals (214) (572). The discrete wavelet signals (214) are analyzed at multiple DOM tree resolution levels (573).
Publication No.: WO2013063734(A1) Publication Date: 2013.05.10
Application No.: WO2011CN81540 Filing Date: 2011.10.31
Applicant: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.; JIAO, LIMEI; LIU, J., JERRY; HOU, HUIMAN; YAO, CONGLEI Inventor: JIAO, LIMEI; LIU, J., JERRY; HOU, HUIMAN; YAO, CONGLEI
Classification: G06F17/27 Main Classification: G06F17/27
Agency: Agent:
Principal Claim:
Address:
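
The pipeline outlined in the abstract (segment a DOM path sequence into B components, count path occurrences per component to form path signals, transform the signals with a discrete wavelet transform, then compare at several resolution levels) can be illustrated with a minimal sketch. Nothing below is taken from the patent's claims: the equal-sized contiguous segmentation, the choice of the Haar wavelet, the Euclidean distance, the requirement that B be a power of two, and all function names (`segment`, `path_signals`, `haar_levels`, `level_distance`) are assumptions made only for illustration.

```python
from collections import Counter
from math import sqrt


def segment(paths, b):
    """Split a path sequence into b contiguous, near-equal components (assumed scheme)."""
    n = len(paths)
    return [paths[i * n // b:(i + 1) * n // b] for i in range(b)]


def path_signals(paths, b):
    """Map each unique path to a length-b signal of per-component occurrence counts."""
    counts = [Counter(component) for component in segment(paths, b)]
    return {p: [c[p] for c in counts] for p in dict.fromkeys(paths)}


def haar_levels(signal):
    """Return Haar approximation coefficients per level, finest level first.

    Assumes len(signal) is a power of two (hypothetical simplification).
    """
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("signal length must be a power of two >= 2")
    levels = []
    approx = [float(x) for x in signal]
    while len(approx) > 1:
        # Pairwise scaled sums give the next, coarser approximation.
        approx = [(approx[i] + approx[i + 1]) / sqrt(2)
                  for i in range(0, len(approx), 2)]
        levels.append(approx)
    return levels


def level_distance(sig_a, sig_b, level):
    """Euclidean distance between two signals at one resolution level (assumed metric)."""
    a = haar_levels(sig_a)[level]
    b = haar_levels(sig_b)[level]
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical root-to-node paths from one page's DOM tree, in document order.
paths = [
    "/html/body/div", "/html/body/div/p",
    "/html/body/div", "/html/body/div/p",
    "/html/body/ul/li", "/html/body/ul/li",
    "/html/body/ul/li", "/html/body/div",
]
signals = path_signals(paths, 4)  # B = 4 components
coarse = level_distance(signals["/html/body/div"], signals["/html/body/div/p"], 1)
fine = level_distance(signals["/html/body/div"], signals["/html/body/div/p"], 0)
```

In this sketch, comparing `fine` and `coarse` shows how two path signals can differ more at a fine resolution than at a coarse one, which is the kind of multi-level comparison the abstract refers to. A library such as PyWavelets could replace the hand-written Haar step.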