Abstract |
A method for content categorization, including: retrieving first content from a first content source selected from a categorized list of content sources; extracting a plurality of words from the first content; associating any of those words with the category with which the first content is associated in the categorized list; retrieving second content from a second content source independently of the categorized list of content sources; extracting a plurality of words from the second content; and associating the second content with the category where any of the words in the second content matches any of the words in the first content, the match being determined in accordance with a predefined heuristic.
|
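The steps of the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the tokenizer, the in-memory (category, text) pairs standing in for retrieved content, and the `min_matches` threshold standing in for the "predefined heuristic" are all assumptions made for illustration, as are the function names.

```python
import re
from collections import defaultdict


def extract_words(text):
    """Return the set of lowercase alphabetic words in the text (assumed tokenizer)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def learn_category_words(categorized_content):
    """Associate extracted words with the category of the content they came from.

    categorized_content: iterable of (category, text) pairs, a stand-in for
    content retrieved from sources in the categorized list.
    """
    category_words = defaultdict(set)
    for category, text in categorized_content:
        category_words[category] |= extract_words(text)
    return category_words


def categorize(text, category_words, min_matches=2):
    """Return the categories whose words match the words of the new content.

    The match rule (at least min_matches shared words) is a hypothetical
    example of a predefined heuristic, not one stated in the source.
    """
    words = extract_words(text)
    return sorted(
        category
        for category, known in category_words.items()
        if len(words & known) >= min_matches
    )


# Content from sources that appear in the categorized list (made-up samples).
listed = [
    ("sports", "The team won the match after a late goal"),
    ("finance", "Shares fell as the bank raised interest rates"),
]
category_words = learn_category_words(listed)

# Content retrieved independently of the list is assigned by word overlap.
print(categorize("The bank cut interest rates again", category_words))
```

With these samples the second call shares four words with the finance vocabulary and only one ("the") with the sports vocabulary, so only "finance" passes the threshold. A practical heuristic would probably also filter common stop words or weight rarer words more heavily; the source leaves the heuristic unspecified.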