Title of invention DOCUMENT DATA ANALYZER AND DOCUMENT DATA ANALYSIS PROGRAM
Abstract PROBLEM TO BE SOLVED: To determine keywords for classifying document data into categories without requiring expert knowledge. SOLUTION: An input part 1 inputs a plurality of pieces of document data, each including category information. A document analysis part 2 recognizes the category information included in the document data input from the input part 1, segments the sentences in the document data into words, generates word frequency data by calculating, for each piece of document data, the frequency of the words it contains, and stores the word frequency data in a word frequency storage part 3. On the basis of the word frequency data stored in the word frequency storage part 3, a word importance calculation part 4 calculates, for each category, the importance value of each word in the document data belonging to that category. On the basis of the result calculated by the word importance calculation part 4, an output part 5 extracts, for each category, the words whose importance values are large. COPYRIGHT: (C)2007,JPO&INPIT
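The pipeline in the abstract (input, word segmentation and counting, per-category importance, keyword extraction) can be sketched as follows. This is only an illustrative sketch: the abstract does not disclose the importance formula, so a TF-IDF-style score computed across categories is assumed here, and whitespace splitting stands in for the word segmentation step (Japanese text would need a morphological analyzer). The function names `word_frequencies`, `category_importance`, and `top_keywords` are hypothetical and do not come from the patent.

```python
import math
from collections import Counter, defaultdict


def word_frequencies(documents):
    """Count words per document; documents is a list of (category, text).

    Assumption: words are separated by whitespace.
    """
    return [(cat, Counter(text.lower().split())) for cat, text in documents]


def category_importance(freq_data):
    """Score each word within each category.

    Assumed formula (not from the patent):
    relative frequency in the category * log(n_categories / categories containing the word).
    """
    per_cat = defaultdict(Counter)
    for cat, counts in freq_data:
        per_cat[cat].update(counts)  # pool counts of documents sharing a category

    n_cats = len(per_cat)
    cat_df = Counter()  # number of categories in which each word occurs
    for counts in per_cat.values():
        cat_df.update(counts.keys())

    scores = {}
    for cat, counts in per_cat.items():
        total = sum(counts.values())
        scores[cat] = {
            w: (c / total) * math.log(n_cats / cat_df[w]) for w, c in counts.items()
        }
    return scores


def top_keywords(scores, k):
    """Return the k highest-scoring words per category (ties broken alphabetically)."""
    return {
        cat: [w for w, _ in sorted(ws.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for cat, ws in scores.items()
    }
```

Under this assumed formula, a word that occurs in every category scores zero, so only words that distinguish a category surface as its keywords. Other weighting schemes would fit the abstract equally well.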
Publication number JP2007241636(A) Publication date 2007.09.20
Application number JP20060062903 Filing date 2006.03.08
Applicant TOSHIBA CORP;TOSHIBA SOLUTIONS CORP Inventor KANO TOSHIYUKI;MATSUMOTO SHIGERU;TAIRA HIROSHI;SO KUNITAKE
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim
Address