Title of invention |
Methods, systems, and articles of manufacture for soft hierarchical clustering of co-occurring objects |
Abstract |
Methods, systems, and articles of manufacture consistent with certain principles related to the present invention enable a computing system to perform hierarchical topical clustering of text data based on statistical modeling of co-occurrences of (document, word) pairs. The computing system may be configured to receive a collection of documents, each document including a plurality of words, and perform a modified deterministic annealing Expectation-Maximization (EM) process on the collection to produce a softly assigned hierarchy of nodes. The process may involve assigning documents and document fragments to multiple nodes in the hierarchy based on words included in the documents, such that a document may be assigned to any ancestor node included in the hierarchy, thus eliminating the hard assignment of documents in the hierarchy. |
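The annealed EM idea named in the abstract can be illustrated with a deliberately simplified sketch. It fits a flat multinomial mixture over (document, word) counts with a tempered E-step, so every document receives a soft probability over classes instead of a hard label. It is not the patented hierarchical model: there is no tree of nodes and no assignment to ancestor nodes. The function name `tempered_em`, the temperature schedule `betas`, the `smoothing` constant, and the seeded initialisation are all assumptions made for illustration.

```python
import math
import random


def tempered_em(counts, n_classes, betas=(0.1, 0.2, 0.4, 0.7, 1.0),
                iters_per_beta=20, smoothing=1e-3, seed=0):
    """Tempered (annealed) EM for a flat multinomial mixture.

    counts[d][w] is the number of times word w occurs in document d.
    Returns (post, p_class, p_word), where post[d][c] is the soft
    assignment of document d to class c.
    Hypothetical sketch: a flat stand-in, not the hierarchical model.
    """
    n_docs, n_words = len(counts), len(counts[0])
    rng = random.Random(seed)  # seeded so the run is reproducible

    # Start from near-uniform parameters with a small perturbation.
    p_class = [1.0 / n_classes] * n_classes
    p_word = []
    for _ in range(n_classes):
        row = [1.0 + 0.1 * rng.random() for _ in range(n_words)]
        total = sum(row)
        p_word.append([v / total for v in row])

    post = [[1.0 / n_classes] * n_classes for _ in range(n_docs)]
    for beta in betas:  # beta is the inverse temperature, raised to 1
        for _ in range(iters_per_beta):
            # E-step: posterior proportional to (P(c) * P(doc | c)) ** beta.
            for d in range(n_docs):
                scores = []
                for c in range(n_classes):
                    log_joint = math.log(p_class[c]) + sum(
                        n * math.log(p_word[c][w])
                        for w, n in enumerate(counts[d]))
                    scores.append(beta * log_joint)
                top = max(scores)  # subtract the max for numerical stability
                weights = [math.exp(s - top) for s in scores]
                norm = sum(weights)
                post[d] = [v / norm for v in weights]

            # M-step: re-estimate parameters from the soft assignments.
            for c in range(n_classes):
                mass = sum(post[d][c] for d in range(n_docs))
                p_class[c] = (mass + smoothing) / (n_docs + n_classes * smoothing)
                row = [smoothing + sum(post[d][c] * counts[d][w]
                                       for d in range(n_docs))
                       for w in range(n_words)]
                total = sum(row)
                p_word[c] = [v / total for v in row]
    return post, p_class, p_word
```

Calling `tempered_em(counts, 2)` on a small count matrix returns one probability row per document. At low `beta` the rows stay close to uniform, and they sharpen as `beta` approaches 1, which is the behaviour annealing is meant to provide.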
Publication number |
EP1304627(B1) |
Publication date |
2014.04.02 |
Application number |
EP20020023413 |
Filing date |
2002.10.18 |
Applicant |
XEROX CORPORATION |
Inventors |
GAUSSIER, ERIC;CHEN, FRANCINE R.;POPAT, ASHOK C. |
Classification codes |
G06F12/00;G06F17/30;G06K9/62 |
Main classification code |
G06F12/00 |
Patent agency |
|
Patent agent |
|
Principal claim |
|
Address |
|