Independent claim |
1. A method for evaluating a content segment for relevancy to a plurality of categories, the method comprising:
retrieving the content segment; for each of a plurality of categories, calculating a score for the content segment to determine the relevancy of the content segment to the category by using a context-based model for the category, said context-based model comprising (i) a set of groups of word sets, each group of word sets comprising a key word set and a second word set, (ii) scores for the groups of word sets, and (iii) a definition of context that specifies when a second word set is in a context of a key word set, wherein the context for a particular key word set is based on a specified relationship, within the content segment, between the particular key word set and a second word set, wherein the calculating comprises:
identifying each key word set from the context-based model that is in the content segment;
for each identified key word set, associating the key word set and each word set within the context of the key word set in the content segment as a different group of word sets; and
aggregating scores for each of the associated groups of word sets to calculate the score for the content segment; and
for each of the plurality of categories, tagging the content segment with a category tag to signify that the content segment is relevant to the category when the content segment is determined to be relevant to the category. |
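
The claimed steps can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that the claim does not state: each word set is reduced to a single lowercase word, the "definition of context" is taken to be a fixed token window around the key word, and a segment counts as relevant when its aggregated score reaches a per-category threshold. The names `ContextModel`, `group_scores`, `window`, `threshold`, and `tag_segment` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContextModel:
    """Hypothetical context-based model for one category."""
    # (i) + (ii): groups of (key word, second word) and their scores.
    group_scores: dict[tuple[str, str], float]
    # (iii): assumed context definition, second word within N tokens of the key word.
    window: int = 3
    # Assumed relevance cutoff; the claim does not say how relevance is decided.
    threshold: float = 1.0

    def score(self, tokens: list[str]) -> float:
        key_words = {key for key, _ in self.group_scores}
        total = 0.0
        for i, token in enumerate(tokens):
            # Identify each key word from the model that occurs in the segment.
            if token not in key_words:
                continue
            lo = max(0, i - self.window)
            hi = min(len(tokens), i + self.window + 1)
            # Pair the key word with every word inside its context window
            # and aggregate the scores of the groups the model knows about.
            for j in range(lo, hi):
                if j != i:
                    total += self.group_scores.get((token, tokens[j]), 0.0)
        return total


def tag_segment(segment: str, models: dict[str, ContextModel]) -> list[str]:
    """Return the category tags whose model scores the segment as relevant."""
    tokens = segment.lower().split()
    return [name for name, model in models.items()
            if model.score(tokens) >= model.threshold]


# Illustrative data only; the words and scores are made up for this sketch.
models = {
    "finance": ContextModel({("bank", "loan"): 2.0, ("bank", "interest"): 1.5}),
    "geography": ContextModel({("bank", "river"): 2.0}),
}
segment = "the bank raised the interest rate on every loan"
tags = tag_segment(segment, models)
print(tags)  # ['finance']
```

In this example "interest" falls within three tokens of "bank" and contributes 1.5, while "loan" lies outside the window and contributes nothing, so only the finance model reaches its threshold. A different context definition (same sentence, same paragraph, syntactic relation) would change only the window logic in `score`.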