Title of invention: Bimodal emotion recognition method and system utilizing a support vector machine
Abstract: A method is disclosed for recognizing emotion by assigning different weights to at least two kinds of unknown information, such as image and audio information, based on their respective recognition reliability. The weights are determined by the distance between the test data and the hyperplane and the standard deviation of the training data, and are normalized by the mean distance between the training data and the hyperplane, so that they represent the classification reliability of the different kinds of information. When the hyperplanes classify the at least two kinds of unidentified information into different results, the method recognizes the emotion according to the unidentified information having the higher weight, thereby correcting the wrong classification result of the other unidentified information and raising the accuracy of emotion recognition. The present disclosure also provides a learning step that achieves a higher learning speed through an iterative algorithm.
Publication No.: US8965762(B2) Publication date: 2015.02.24
Application No.: US201113022418 Filing date: 2011.02.07
Applicant: Industrial Technology Research Institute Inventors: Song Kai-Tai; Han Meng-Ju; Hsu Jing-Huai; Hong Jung-Wei; Chang Fuh-Yu
Classification: G10L25/51; G10L25/63; G10L17/26; G06K9/00 Main classification: G10L25/51
Agency: WPAT, PC Agent: WPAT, PC; King Justin
Main claim: 1. A method used for emotion recognition comprising the steps of: (a) establishing hyperplanes, further comprising the steps of: (a1) establishing a plurality of training samples; and (a2) using a means of support vector machine (SVM) to establish the hyperplanes based upon the plurality of training samples; (b) inputting at least two unknown data to be identified while enabling each unknown data to correspond to one of the hyperplanes, whereas there are two emotion categories being defined in the one of the hyperplanes, and each unknown data being a data selected from an image data and a vocal data; (c) respectively performing a calculation process, using a computer, upon the at least two unknown data for assigning each with a weight, the calculation process further comprising the steps of: (c1) based upon the plurality of training samples used for establishing the one of the hyperplanes, acquiring a standard deviation and a mean distance between the plurality of training samples and the one of the hyperplanes; (c2) respectively calculating feature distances between the one of the hyperplanes and the at least two unknown data to be identified; and (c3) obtaining the weights of the at least two unknown data by performing a mathematical operation upon the feature distances, the plurality of training samples, the mean distance and the standard deviation, the mathematical operation further comprising the steps of: obtaining differences between the feature distances and the standard deviation; and normalizing the differences for obtaining the weights, wherein weights of facial image $Z_{Fi}$ and weights of vocal data $Z_{Ai}$ are obtained, wherein
$$Z_{Fi} = \frac{D_{Fi} - \sigma_F}{D_{Fave} - \sigma_F}, \quad \text{for } i = 1 \sim N,$$
and
$$Z_{Ai} = \frac{D_{Ai} - \sigma_A}{D_{Aave} - \sigma_A}, \quad \text{for } i = 1 \sim N;$$
wherein $D_{Fave}$ and $D_{Aave}$ represent average distances between the plurality of training samples and the one of the hyperplanes of facial and speech training data respectively, $\sigma_F$ and $\sigma_A$ represent standard deviations of facial and speech training data respectively, and $D_{Fi}$ and $D_{Ai}$ represent distances between the facial and speech test samples and the corresponding one of the hyperplanes respectively; and (d) comparing the assigned weights of the two unknown data while using the comparison as a basis for selecting one emotion category out of a plurality of emotion categories as an emotion recognition result.
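The weight calculation in steps (c1) to (c3) and the comparison in step (d) can be illustrated with a minimal Python sketch. It is not the patented implementation and rests on several assumptions that the claim does not settle: the distances are unsigned distances to the hyperplane, the standard deviation is taken over the training distances as a population standard deviation, and a tie between weights is resolved in favor of the facial result. The function names `reliability_weight` and `fuse_decisions` are hypothetical.

```python
from statistics import mean, pstdev


def reliability_weight(test_distance, train_distances):
    """Compute Z = (D_i - sigma) / (D_ave - sigma) for one modality.

    Assumption: train_distances holds unsigned distances between the
    training samples and the hyperplane, and sigma is their population
    standard deviation.
    """
    d_ave = mean(train_distances)
    sigma = pstdev(train_distances)
    denominator = d_ave - sigma
    if denominator == 0:
        raise ValueError("mean distance equals standard deviation")
    return (test_distance - sigma) / denominator


def fuse_decisions(face_label, face_weight, audio_label, audio_weight):
    """Pick the emotion label from the modality with the higher weight.

    Assumption: when both weights are equal, the facial label is kept.
    """
    if face_label == audio_label:
        return face_label
    return face_label if face_weight >= audio_weight else audio_label


# Illustrative numbers only; they are not taken from the patent.
z_face = reliability_weight(2.5, [1.0, 3.0])
z_audio = reliability_weight(3.0, [2.0, 6.0])
result = fuse_decisions("happy", z_face, "sad", z_audio)
```

In this made-up example the facial classifier places its test sample relatively farther from its hyperplane than the vocal classifier does, so the facial label is selected when the two classifiers disagree.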
Address: Hsinchu TW