Title of invention TEXT CLASSIFICATION PROCESSING METHOD, TEXT CLASSIFICATION PROCESSING DEVICE AND TEXT CLASSIFICATION PROCESSING PROGRAM
Abstract PROBLEM TO BE SOLVED: To provide a text classification processing method for determining whether a text belongs to a given category. SOLUTION: The text classification processing method extracts character strings of a fixed length or less from a text, calculates a feature quantity for each character string, and generates a feature vector from those feature quantities. When a training data group is given in which each text has been labeled in advance as belonging or not belonging to the category, each training text is converted to a feature vector, the feature vectors are supplied to a support vector machine together with their labels for learning, and a support vector machine text classifier is generated. When a text whose membership in the category is unknown is given, its feature vector is generated, and the generated text classifier determines whether the text belongs to the category. COPYRIGHT: (C)2008,JPO&INPIT
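The pipeline in the abstract (substrings up to a fixed length, a characteristic or feature vector, then a support vector machine) can be illustrated with a minimal standard-library sketch. Several points here are assumptions and are not stated in the abstract: the feature quantity is taken to be a simple occurrence count, the classifier is a linear SVM trained with a Pegasos-style subgradient method standing in for whatever SVM formulation and kernel the patent actually uses, and all function names and parameters (`char_ngrams`, `feature_vector`, `train_linear_svm`, `classify`, `max_len`, `lam`, `epochs`) are hypothetical.

```python
from collections import Counter


def char_ngrams(text, max_len=3):
    """Return every substring of length 1..max_len, shortest lengths first."""
    return [text[i:i + n]
            for n in range(1, max_len + 1)
            for i in range(len(text) - n + 1)]


def feature_vector(text, max_len=3):
    """Sparse feature vector mapping substring -> occurrence count.

    Using raw counts as the feature quantity is an assumption of this sketch.
    """
    return Counter(char_ngrams(text, max_len))


def dot(weights, vector):
    """Sparse dot product between a weight dict and a feature vector."""
    return sum(weights.get(key, 0.0) * value for key, value in vector.items())


def train_linear_svm(vectors, labels, epochs=50, lam=0.01):
    """Train a linear SVM on hinge loss with Pegasos-style subgradient steps.

    Labels must be +1 (belongs to the category) or -1 (does not belong).
    Examples are visited in a fixed cyclic order so the result is deterministic.
    """
    weights = {}
    step = 0
    for _ in range(epochs):
        for vector, label in zip(vectors, labels):
            step += 1
            eta = 1.0 / (lam * step)             # decaying learning rate
            margin = label * dot(weights, vector)
            shrink = 1.0 - 1.0 / step            # equals 1 - eta * lam
            for key in weights:                  # regularisation shrinkage
                weights[key] *= shrink
            if margin < 1.0:                     # hinge loss is active
                for key, value in vector.items():
                    weights[key] = weights.get(key, 0.0) + eta * label * value
    return weights


def classify(weights, text, max_len=3):
    """Return True if the text is judged to belong to the category."""
    return dot(weights, feature_vector(text, max_len)) > 0.0
```

To use the sketch, convert each labeled training text with `feature_vector`, pass the vectors and their +1/-1 labels to `train_linear_svm`, and call `classify` on a text whose category membership is unknown. Working on character substrings rather than words means no tokenizer or dictionary is required, which is one plausible reason for the design described in the abstract.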
Publication number JP2008084064(A) Publication date 2008.04.10
Application number JP20060264088 Filing date 2006.09.28
Applicant NATIONAL INSTITUTE OF ADVANCED INDUSTRIAL & TECHNOLOGY Inventor SADOHARA TAKESHI
Classification G06F17/30;G06N3/00 Main classification G06F17/30
Agency Agent
Principal claim
Address