摘要 |
A system, method and computer program product provides a solution to a class of categorization problems using a semi-supervised clustering approach, the method employing performing a Soft Seeded k-means algorithm, which makes effective use of the side information provided by seeds with a wide range of confidence levels, even when they do not provide complete coverage of the pre-defined categories. The semi-supervised clustering is achieved through the introductions of a seed re-assignment penalty measure and model selection measure.
|