Independent claim |
1. A computer program product comprising a non-transitory computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising computer readable program code configured to:
analyze a web browsing interaction history of a user associated with a given web page, the web browsing interaction history indicating that the user interacted with at least one element of the given web page;
construct a document object model (DOM) of the given web page;
identify, based on the analyzed web browsing interaction history, a node within the DOM corresponding to the at least one element in the given web page;
identify an ancestor node of the node;
determine a distribution of node tag types for a set of child nodes associated with the identified ancestor node; and
determine that the identified ancestor node comprises an item list including the at least one element within the given web page based on at least the distribution satisfying a given threshold, wherein each item in the item list corresponds to a child node in the set of child nodes, wherein determining that the identified ancestor node comprises an item list further comprises:
determining, based on the distribution, a number of matching child tags associated with each of the set of child nodes, wherein each of the matching child tags is a structure based tag, and wherein a structure based tag affects a structure of a web page;
determining if the number of matching child tags satisfies the given threshold;
based on at least the number of matching child tags satisfying the given threshold, identifying the identified ancestor node as a candidate node comprising the item list; and
based on at least the number of matching child tags failing to satisfy the given threshold, determining that the identified ancestor node comprises the item list. |
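
As an illustration only, the steps recited above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: `STRUCTURE_TAGS`, `find_item_list`, the `clicked_id` argument (standing in for the analyzed interaction history), the ratio-style `threshold`, and `min_items` are all hypothetical choices, since the claim neither enumerates structure based tags nor defines how the threshold is measured. Only the branch in which the threshold is satisfied is modeled.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumption: the tags treated as "structure based"; the claim does not list them.
STRUCTURE_TAGS = {"div", "li", "tr", "td", "ul", "ol", "table", "p", "section", "article"}
# Elements that never take a closing tag, so the builder must not descend into them.
VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}


class Node:
    """One element of a simplified DOM tree."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []


class DomBuilder(HTMLParser):
    """Builds a simplified DOM; assumes well-formed, properly nested markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.current.parent is not None:
            self.current = self.current.parent


def find_by_id(node, element_id):
    """Depth-first search for the node whose id attribute matches."""
    if node.attrs.get("id") == element_id:
        return node
    for child in node.children:
        found = find_by_id(child, element_id)
        if found is not None:
            return found
    return None


def tag_distribution(ancestor):
    """Distribution of node tag types over the ancestor's child nodes."""
    return Counter(child.tag for child in ancestor.children)


def is_item_list_candidate(ancestor, threshold=0.8, min_items=2):
    """True if enough children share one structure based tag (assumed ratio test)."""
    dist = tag_distribution(ancestor)
    structural = {tag: n for tag, n in dist.items() if tag in STRUCTURE_TAGS}
    if not structural:
        return False
    matching = max(structural.values())
    if matching < min_items:
        return False
    return matching / sum(dist.values()) >= threshold


def find_item_list(html, clicked_id, threshold=0.8):
    """Walk up from the interacted element to the first ancestor that qualifies."""
    builder = DomBuilder()
    builder.feed(html)
    node = find_by_id(builder.root, clicked_id)
    ancestor = node.parent if node is not None else None
    while ancestor is not None and ancestor.tag != "#document":
        if is_item_list_candidate(ancestor, threshold):
            return ancestor
        ancestor = ancestor.parent
    return None
```

For a page whose `<ul id="results">` holds three `<li>` children, one of which contains the element the user interacted with, `find_item_list(page_html, "hit")` would walk from that element up to the `<ul>` and return it as the candidate node. A ratio was chosen for the threshold here so that the same value can apply to lists of different lengths; an absolute count would be an equally plausible reading.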