Abstract |
Methods, systems and computer program products for generating a hierarchical representation of a hypertext markup language (HTML) document. A state of a web page is captured at a point in time. A plurality of content elements of the captured web page are identified. The content elements are organized into groups based on an associated type and/or content of respective ones of the content elements, and the resulting grouping provides the hierarchical representation of the HTML document.
|
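
The three steps in the abstract (capture, identify, group) can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes the captured state is already available as an HTML string, it groups by type only, and the names `CONTENT_TYPES`, `ContentCollector`, and `build_hierarchy` are hypothetical, as is the tag-to-type mapping. Only Python's standard `html.parser` module is used.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical mapping from HTML tag to content type (an assumption, not from the source).
CONTENT_TYPES = {
    "h1": "heading", "h2": "heading", "h3": "heading",
    "p": "text", "a": "link", "img": "image",
}


class ContentCollector(HTMLParser):
    """Identifies content elements in a captured page and records (type, content) pairs."""

    def __init__(self):
        super().__init__()
        self.elements = []  # identified content elements, as (type, content)
        self._open = []     # stack of (tag, type, text parts) for elements still open

    def handle_starttag(self, tag, attrs):
        kind = CONTENT_TYPES.get(tag)
        if kind is None:
            return
        if tag == "img":
            # Images carry no text; use the src attribute as their content.
            self.elements.append((kind, dict(attrs).get("src") or ""))
        else:
            self._open.append((tag, kind, []))

    def handle_data(self, data):
        # Text is attributed to the innermost open content element.
        if self._open:
            self._open[-1][2].append(data)

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == tag:
            _, kind, parts = self._open.pop()
            text = " ".join("".join(parts).split())  # normalize whitespace
            if text:
                self.elements.append((kind, text))


def build_hierarchy(snapshot_html):
    """Groups identified content elements by type under a single document root."""
    collector = ContentCollector()
    collector.feed(snapshot_html)
    collector.close()
    groups = defaultdict(list)
    for kind, content in collector.elements:
        groups[kind].append(content)
    return {"document": dict(groups)}


# The snapshot stands in for a page state captured at a point in time.
snapshot = (
    "<html><body><h1>News</h1><p>Hello <a href='/a'>world</a></p>"
    "<img src='x.png'><p>Bye</p></body></html>"
)
tree = build_hierarchy(snapshot)
```

Here `tree` is a two-level hierarchy: a document root whose children are type groups, each holding the content of the elements of that type. Grouping by content as well as type, as the abstract allows, would add another level under each group.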