Abstract |
Methods, systems, and computer-readable media for extracting product lines from a plurality of product titles are provided. In one embodiment, the plurality of product titles are broken into tokens. Association rules are calculated for individual tokens and pairs of tokens. Brand-specific terms and product-class-specific terms within the product titles are identified. In one embodiment, a token tree is used to identify product lines within the list of product titles using the association rules, the brand-specific terms, and the product-class-specific terms.
|
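The tokenization and association-rule steps described in the abstract can be illustrated with a minimal sketch. The abstract does not define the tokenizer or the rule metrics, so everything here is an assumption: whitespace tokenization, support (the fraction of titles containing a token), and confidence (the fraction of titles containing token `a` that also contain token `b`) are standard choices from association-rule mining, not the patented method. The function names and sample titles are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tokenize(title):
    # Assumption: lowercase whitespace tokenization; the source does not
    # specify how titles are broken into tokens.
    return title.lower().split()


def association_rules(titles):
    """Return (support, confidence) for single tokens and ordered token pairs."""
    n = len(titles)
    token_counts = Counter()
    pair_counts = Counter()
    for title in titles:
        # Count each token at most once per title; sort so each unordered
        # pair is always recorded under the same key.
        tokens = sorted(set(tokenize(title)))
        token_counts.update(tokens)
        pair_counts.update(combinations(tokens, 2))

    # Support: fraction of titles that contain the token.
    support = {t: c / n for t, c in token_counts.items()}

    # Confidence of a -> b: fraction of titles with a that also contain b.
    confidence = {}
    for (a, b), c in pair_counts.items():
        confidence[(a, b)] = c / token_counts[a]
        confidence[(b, a)] = c / token_counts[b]
    return support, confidence


# Hypothetical product titles, for illustration only.
titles = [
    "Acme TrailRunner 200 Shoe",
    "Acme TrailRunner 300 Shoe",
    "Acme CityWalk Shoe",
    "Zeta TrailMaster Boot",
]
support, confidence = association_rules(titles)
```

In this toy data, every title containing `trailrunner` also contains `acme`, while the reverse does not hold. An asymmetry of that kind is one plausible signal for separating a brand term from a product-line term, although the abstract does not say how the rules are actually used in the token tree.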