Independent claim |
1. A method for assigning keywords to a web page to form thereby a set of web page representative keywords, comprising:
identifying self keywords associated with the web page to form thereby a set of identified self keywords, the self keywords comprising keyword data from the web page;
identifying in-link keywords associated with the web page to form thereby a set of identified in-link keywords, the in-link keywords comprising keyword data from other web pages including a link to the web page;
identifying out-link keywords associated with the web page to form thereby a set of identified out-link keywords, the out-link keywords comprising keyword data from other web pages having a link to said other web pages from the web page;
extracting, from each of the sets of identified self, in-link and out-link keywords, a plurality of potential keyword phrases, each keyword phrase comprising at least two keywords within a respective set of keywords;
evaluating each of the identified keywords in each of the sets and the extracted keyword phrases according to a reference function to determine thereby valid keywords and keyword phrases;
assigning weights to each of the valid self, in-link and out-link keywords and keyword phrases to form a set of weighted keywords and keyword phrases associated with the web page, wherein each of the valid in-link and out-link keywords and keyword phrases is assigned a weight according to a ranking of a respective source web page;
generating a rank order of the valid keywords using one or more of count, unique count and weighted unique count heuristic functions; and
combining, in the rank order, the valid self, in-link and out-link keywords and keyword phrases to form a set of web page representative keywords and keyword phrases associated with the web page separated by first delineators and stored in a memory. |
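
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the claimed implementation, only one possible reading under stated assumptions: the names `Source`, `phrases`, `is_valid` and `representative_keywords` are hypothetical, keyword phrases are taken to be adjacent two-word pairs, the reference function is approximated by a small stop-word filter, self keywords receive a fixed weight of 1.0, and the "weighted unique count" heuristic is interpreted as summing the source page ranking once per source in which a term appears.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

# Assumption: a tiny stop-word list stands in for the claim's "reference function".
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in"}


@dataclass
class Source:
    """Keywords gathered from one page (hypothetical container)."""
    kind: str          # "self", "in" (in-link) or "out" (out-link)
    words: List[str]   # keyword data in document order
    rank: float = 1.0  # ranking of the source page; 1.0 assumed for the page itself


def phrases(words: List[str]) -> List[str]:
    """Extract candidate phrases; assumed here to be adjacent two-word pairs."""
    return [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]


def is_valid(term: str) -> bool:
    """Stand-in reference function: reject terms containing stop words or very short words."""
    return all(w not in STOPWORDS and len(w) > 2 for w in term.split())


def representative_keywords(sources: List[Source], top_n: int = 5,
                            delimiter: str = ", ") -> str:
    """Weight valid terms by source ranking, rank them, and join with a delimiter."""
    scores = Counter()
    for src in sources:
        candidates = src.words + phrases(src.words)
        # Each term counts once per source (unique count), weighted by the source ranking.
        for term in {t for t in candidates if is_valid(t)}:
            scores[term] += src.rank
    # Highest score first; ties broken alphabetically so the order is deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return delimiter.join(term for term, _ in ranked[:top_n])


if __name__ == "__main__":
    pages = [
        Source("self", ["solar", "panel", "installation"]),
        Source("in", ["solar", "panel", "guide"], rank=2.0),
        Source("out", ["the", "solar", "inverter"], rank=0.5),
    ]
    print(representative_keywords(pages, top_n=3))
```

In this sketch the delimiter-joined string plays the role of the stored set of representative keywords separated by first delineators; a real system would choose its own ranking source, validity test and storage format.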