Abstract |
A text mining program is provided that allows a user to perform text mining operations such as information retrieval, term and document visualization, term and document clustering, term and document classification, summarization of individual documents and groups of documents, and document cross-referencing. These operations are accomplished by representing the text of a document collection using subspace transformations. The subspace representation is produced in three steps: constructing a term frequency matrix of the term frequencies for each of the documents, transforming the term frequencies for statistical purposes, and projecting the documents or the terms into a lower-dimensional subspace. As the document collection is updated, the subspace is dynamically updated to reflect the new document collection.
|
Applicant |
THE BOEING COMPANY |
Inventors |
BILLHEIMER, D., DEAN;BOOKER, ANDREW, JAMES;CONDLIFF, MICHELLE, KEIM;GREAVES, MARK, THOMAS;HOLT, FREDRICK, BADEN;KAO, ANNE, SHU-WAN;PIERCE, DANIEL, JOHN;POTEET, STEPHEN, ROBERT;WU, YUAN-JYE |
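
The three steps named in the abstract (term frequency matrix, statistical transformation, projection into a lower-dimensional subspace) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the whitespace tokenizer, the log(1 + tf) weighting, the use of a truncated SVD as the subspace transformation, the toy documents, and the helper names `term_frequency_matrix`, `project_documents`, and `cosine`. Dynamic updating of the subspace is not shown.

```python
import numpy as np


def term_frequency_matrix(docs):
    """Return (sorted vocabulary, terms-by-documents count matrix)."""
    # Assumption: lowercase whitespace tokenization stands in for real parsing.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: i for i, term in enumerate(vocab)}
    tf = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for tok in toks:
            tf[index[tok], j] += 1.0
    return vocab, tf


def project_documents(tf, k):
    """Dampen raw counts, then project documents onto a k-dimensional subspace."""
    # Assumption: log(1 + tf) is one possible statistical transformation.
    weighted = np.log1p(tf)
    # Assumption: a truncated SVD supplies the subspace basis.
    u, _, _ = np.linalg.svd(weighted, full_matrices=False)
    term_basis = u[:, :k]                  # shape: (terms, k)
    doc_coords = term_basis.T @ weighted   # shape: (k, documents)
    return term_basis, doc_coords


def cosine(a, b):
    """Cosine similarity between two coordinate vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy collection: two documents per topic, no shared terms across topics.
docs = [
    "engine turbine blade inspection",
    "turbine blade crack inspection report",
    "cabin seat fabric order",
    "cabin seat fabric supplier order form",
]
vocab, tf = term_frequency_matrix(docs)
term_basis, doc_coords = project_documents(tf, k=2)

same_topic = cosine(doc_coords[:, 0], doc_coords[:, 1])
cross_topic = cosine(doc_coords[:, 0], doc_coords[:, 2])
print(f"same topic: {same_topic:.3f}, cross topic: {cross_topic:.3f}")
```

In this toy collection, documents on the same topic end up pointing in nearly the same direction in the reduced space, while documents on different topics are nearly orthogonal. Retrieval and clustering operations of the kind the abstract lists could be built on such similarities, although the abstract itself does not specify how.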