Title of invention TEXT DATA STUDY ANALYSIS SYSTEM, TEXT DATA STUDY DEVICE, TEXT DATA ANALYSIS DEVICE, ITS METHOD AND PROGRAM
Abstract PROBLEM TO BE SOLVED: To provide a text data study analysis system that analyzes text data simply, without using a dictionary that depends on the target text and without dividing the text data by content; and to provide a text data study device, a text data analysis device, and a method and program therefor. SOLUTION: An extraction means 102 extracts, from study data (training data), a plurality of features that characterize the study data; a generation means 105 generates vectors indicating whether each feature is included in each item of text data; a division means 104 divides the vectors, on the basis of the classes included in the study data, into belonging vectors, which belong to a given class, and non-belonging vectors, which do not; a computation means 106 computes, for each class, a model for determining whether an arbitrary vector is a belonging vector, based on the belonging vectors; a presumption means 108 applies each vector to each of the plurality of models and presumes the class that matches the content of the text data corresponding to that vector; a calculation means 109 calculates, for each class, the frequencies at which several features selected from the plurality of features appear in evaluation data; and the features related to each class are selected based on the per-class frequencies. COPYRIGHT: (C)2006,JPO&NCIPI
Publication number JP2006085634(A) Publication date 2006.03.30
Application number JP20040272377 Filing date 2004.09.17
Applicant TOSHIBA CORP Inventor SAKURAI SHIGEAKI
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address
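
The processing chain summarized in the abstract (extract features, build presence vectors, split them per class, compute one model per class, presume a class, count per-class feature frequencies) can be illustrated with a minimal Python sketch. The abstract does not say how features are extracted or what form the per-class model takes, so everything concrete here is an assumption made for illustration: whitespace tokenization, frequency-based feature extraction, a centroid model with an agreement score, and all function names and sample texts are hypothetical and are not taken from the patent.

```python
from collections import Counter


def extract_features(texts, top_k):
    """Assumed extraction step: keep the top_k most widespread tokens."""
    counts = Counter(tok for t in texts for tok in set(t.lower().split()))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


def to_vector(text, features):
    """Binary vector: 1 if the feature occurs in the text, else 0."""
    tokens = set(text.lower().split())
    return [1 if f in tokens else 0 for f in features]


def divide(vectors, labels, cls):
    """Split vectors into those belonging to cls and those that do not."""
    belonging = [v for v, y in zip(vectors, labels) if y == cls]
    non_belonging = [v for v, y in zip(vectors, labels) if y != cls]
    return belonging, non_belonging


def train_models(vectors, labels):
    """One model per class; assumed here to be the mean belonging vector."""
    models = {}
    for cls in sorted(set(labels)):
        belonging, _ = divide(vectors, labels, cls)
        models[cls] = [sum(col) / len(belonging) for col in zip(*belonging)]
    return models


def presume_class(vector, models):
    """Apply the vector to every model and return the best-scoring class."""
    def score(centroid):
        # Assumed score: reward agreement on present and on absent features.
        return sum(c if v else 1 - c for v, c in zip(vector, centroid))
    return max(sorted(models), key=lambda cls: score(models[cls]))


def class_feature_frequencies(eval_texts, features, models):
    """Count, per presumed class, how often each feature appears."""
    freqs = {cls: Counter() for cls in models}
    for text in eval_texts:
        vec = to_vector(text, features)
        cls = presume_class(vec, models)
        for f, bit in zip(features, vec):
            freqs[cls][f] += bit
    return freqs


def select_related_features(freqs, n):
    """Per class, keep up to n features with the highest nonzero frequency."""
    selected = {}
    for cls, counter in freqs.items():
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        selected[cls] = [f for f, c in ranked if c > 0][:n]
    return selected


# Hypothetical study data (text, class) and evaluation data.
study = [
    ("battery drains fast", "complaint"),
    ("screen cracked fast", "complaint"),
    ("great battery life", "praise"),
    ("great screen quality", "praise"),
]
texts = [t for t, _ in study]
labels = [y for _, y in study]
features = extract_features(texts, top_k=4)
models = train_models([to_vector(t, features) for t in texts], labels)
evaluation = ["battery died fast", "great screen"]
freqs = class_feature_frequencies(evaluation, features, models)
related = select_related_features(freqs, n=2)
```

In this sketch only the belonging vectors feed each class model, mirroring the abstract's wording; a real system could equally use the non-belonging vectors as negative examples for a discriminative classifier.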