Title of invention |
Document categorizing method, document categorizing apparatus, and storage medium on which a document categorization program is stored |
Abstract |
A document categorizing apparatus includes a sentence analyzer 12 for analyzing a plurality of documents to detect titles thereof; a feature element extractor 13 for extracting feature elements from the titles detected by the sentence analyzer 12 in the respective documents; a feature table generating means 14 for generating a feature table representing the relationships between the feature elements extracted from the titles and the documents including those feature elements; a document categorizing unit 15 for categorizing the documents into a plurality of clusters according to semantic similarity on the basis of the content of the feature table; a categorization result storage unit 16 for storing the clusters created by the document categorizing unit 15; a cluster merging unit 2 for performing a cluster merging process upon the clusters stored in the categorization result storage unit 16; and an output control unit 31 for outputting the result of the cluster merging process to a display unit 32.
|
Publication number |
US7213205(B1) |
Publication date |
2007.05.01 |
Application number |
US20000762126 |
Filing date |
2000.06.02 |
Applicant |
SEIKO EPSON CORPORATION |
Inventors |
MIWA SHINJI;NAGAISHI MICHIHIRO |
Classification numbers |
G06F17/00;G06F17/27;G06F17/30 |
Main classification number |
G06F17/00 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|
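
The pipeline summarized in the abstract (title detection, feature element extraction, feature table generation, categorization, cluster merging) can be illustrated with a minimal Python sketch. This is not the patented method: every function name below is hypothetical, the title is assumed to be the first non-empty line of a document, feature elements are assumed to be lowercase title words minus a small stopword list, and shared title words plus Jaccard overlap stand in for the "semantic similarity" the abstract mentions.

```python
from collections import defaultdict

# Assumption: a tiny stopword list; the record does not specify one.
STOPWORDS = {"a", "an", "the", "of", "for", "and", "on", "in", "to"}


def detect_title(document: str) -> str:
    """Assumption: the title is the first non-empty line of the document."""
    for line in document.splitlines():
        if line.strip():
            return line.strip()
    return ""


def extract_features(title: str) -> set[str]:
    """Assumption: feature elements are lowercase title words minus stopwords."""
    words = (w.strip(".,:;!?()\"'").lower() for w in title.split())
    return {w for w in words if w and w not in STOPWORDS}


def build_feature_table(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each feature element to the ids of the documents whose title has it."""
    table: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for feature in extract_features(detect_title(text)):
            table[feature].add(doc_id)
    return dict(table)


def categorize(table: dict[str, set[str]], min_docs: int = 2) -> dict[str, set[str]]:
    """Form one cluster per feature element shared by at least min_docs documents."""
    return {f: set(docs) for f, docs in table.items() if len(docs) >= min_docs}


def merge_clusters(clusters: dict[str, set[str]], threshold: float = 0.5):
    """Merge clusters whose document sets have Jaccard overlap >= threshold."""
    merged: list[tuple[set[str], set[str]]] = []
    for feature, docs in sorted(clusters.items()):
        for labels, members in merged:
            if len(docs & members) / len(docs | members) >= threshold:
                labels.add(feature)   # record the extra label on the merged cluster
                members |= docs       # in-place union keeps the tuple's set updated
                break
        else:
            merged.append(({feature}, set(docs)))
    return merged


if __name__ == "__main__":
    docs = {
        "d1": "Printer driver installation\nSteps for installing the driver.",
        "d2": "Printer driver update\nHow to update the driver.",
        "d3": "Scanner setup guide\nConnecting the scanner.",
        "d4": "Scanner setup tips\nCommon setup problems.",
    }
    for labels, members in merge_clusters(categorize(build_feature_table(docs))):
        print(sorted(labels), sorted(members))
```

Clusters are keyed by a single feature element before merging, so documents may belong to several clusters at once; the merging step then collapses clusters that cover largely the same documents. The `min_docs` and `threshold` parameters are illustrative defaults, not values taken from the record.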