Title: Discrete Wavelet Transform Method for Document Structure Similarity
Abstract: Examples of the present disclosure may include methods, systems, and computer readable media with executable instructions. An example method for determining document structure similarity can include segmenting path sequences (206) of Document Object Model (DOM) trees (120, 462) from a number of web pages (202) into B components (561). Path signals (210) corresponding to the path sequences (206) are determined based on a count of the occurrences of particular paths in the Bth component (571), and unique path signals (210) are transformed into discrete wavelet signals (214) (572). The discrete wavelet signals (214) are analyzed at multiple DOM tree resolution levels (573).
Publication number: US2014236968(A1) Publication date: 2014.08.21
Application number: US201114347572 Filing date: 2011.10.31
Applicant: Jiao Li-mei;Liu Jerry J.;Hou Hui-man;Yao Cong-Lei Inventor: Jiao Li-mei;Liu Jerry J.;Hou Hui-man;Yao Cong-Lei
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim: 1. A method for determining document structure similarity, comprising: segmenting path sequences (206) of Document Object Model (DOM) trees (120, 462) from a number of web pages (202) into B components (561); determining path signals (210) corresponding to the path sequences (206) based on a count of the occurrences of particular paths in the Bth component (571); transforming unique path signals (210) into discrete wavelet signals (214) (572); and analyzing the discrete wavelet signals (214) at multiple DOM tree resolution levels (573).
Address: Beijing CN
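
Illustrative sketch (not part of the patent record): the claimed steps of segmenting a path sequence into B components, counting occurrences of a particular path per component, and transforming the resulting signal into wavelet coefficients might look roughly like the following. This is a minimal sketch under stated assumptions, not the applicants' implementation: the function names `path_signal` and `haar_dwt`, the equal-width segmentation, the choice of the Haar wavelet, the averaging normalization, and the requirement that B be a power of two are all assumptions introduced here for illustration.

```python
def path_signal(path_sequence, target_path, num_components):
    """Split a DOM path sequence into B contiguous components and count
    how often target_path occurs in each one (hypothetical helper)."""
    n = len(path_sequence)
    signal = []
    for b in range(num_components):
        # Equal-width segmentation is an assumption made for this sketch.
        start = b * n // num_components
        end = (b + 1) * n // num_components
        signal.append(path_sequence[start:end].count(target_path))
    return signal


def haar_dwt(signal):
    """Decompose a signal with an averaging Haar wavelet (assumed choice).

    Returns one (approximation, detail) pair per resolution level,
    finest level first. The signal length must be a power of two.
    """
    levels = []
    approx = list(signal)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / 2 for a, b in pairs]
        approx = [(a + b) / 2 for a, b in pairs]
        levels.append((approx, detail))
    return levels


# Hypothetical path sequence taken from a traversal of one page's DOM tree.
sequence = [
    "html/body/div", "html/body/div/p",
    "html/body/div", "html/body/ul",
    "html/body/div", "html/body/div/p",
    "html/body/ul", "html/body/div",
]
signal = path_signal(sequence, "html/body/div/p", num_components=4)
levels = haar_dwt(signal)
```

Comparing two pages would then amount to comparing their coefficients level by level: coarse levels capture overall layout, finer levels capture local differences. How the levels are weighted or thresholded is not stated in this record and is left open here.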