Title of invention: Method and apparatus for detecting talking segments in a video sequence using visual cues
Abstract: A method and system for detecting temporal segments of talking faces in a video sequence using visual cues. The system detects talking segments by classifying talking and non-talking segments in a sequence of image frames using visual cues. The present disclosure detects temporal segments of talking faces in video sequences by first localizing the face and the eyes, and hence the mouth region. The localized mouth regions across the video frames are then encoded as an integrated gradient histogram (IGH) of visual features and quantified using the entropy of the IGH. The time series of per-frame entropy values is further clustered using an online temporal segmentation (K-Means clustering) algorithm to distinguish talking mouth patterns from other mouth movements. The segmented time series data is then used to enhance an emotion recognition system.
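The entropy step described in the abstract can be illustrated with a minimal sketch. The record does not give the exact construction of the IGH, so the code below substitutes a plain magnitude-weighted histogram of gradient orientations as a stand-in; the function names `gradient_histogram` and `shannon_entropy`, the bin count, and the central-difference gradient are all assumptions made for illustration, not the patented method.

```python
import math

def gradient_histogram(patch, bins=8):
    """Magnitude-weighted histogram of gradient orientations.

    Assumption: a simplified stand-in for the IGH named in the abstract.
    `patch` is a 2-D list of grayscale values (e.g. a mouth region).
    """
    hist = [0.0] * bins
    rows, cols = len(patch), len(patch[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central differences over the interior pixels.
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            # Unsigned orientation folded into [0, pi).
            angle = math.atan2(gy, gx) % math.pi
            idx = min(int(angle / math.pi * bins), bins - 1)
            hist[idx] += mag
    return hist

def shannon_entropy(hist):
    """Shannon entropy (in bits) of a histogram after normalization."""
    total = sum(hist)
    if total == 0:
        return 0.0
    probs = [h / total for h in hist if h > 0]
    return -sum(p * math.log2(p) for p in probs)
```

Under these assumptions, a patch whose gradients all point the same way yields zero entropy, while a patch with gradient energy spread evenly across orientations yields the maximum, log2 of the bin count. Computing one such value per frame gives the entropy time series that the abstract refers to.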
Publication number: US9110501(B2) Publication date: 2015.08.18
Application number: US201313800486 Filing date: 2013.03.13
Applicant: Samsung Electronics Co., Ltd. Inventors: Velusamy Sudha; Gopalakrishnan Viswanath; Navathe Bilva Bhalachandra; Sharma Anshul
Classification: G06K9/00; G06F3/01; G06K9/46 Main classification: G06K9/00
Agency: NSIP Law Agent: NSIP Law
Independent claim: 1. A method for detecting and classifying talking segments of a face in a visual cue in order to infer emotions, the method comprising: normalizing and localizing a face region for each frame of the visual cue; obtaining a histogram of structure descriptive features of the face for the frame in the visual cue; deriving an integrated gradient histogram (IGH) from the descriptive features for the frame in the visual cue; computing entropy of the IGH for the frame in the visual cue; performing segmentation of the IGH to detect talking segments for the face in the visual cues; and analyzing the segments for the frame in the visual cues to infer emotions.
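The segmentation step in the claim can be sketched as a two-cluster K-Means over a series of per-frame entropy values, followed by grouping consecutive frames that share a cluster label. This is a batch (offline) simplification; the record describes an online temporal segmentation, whose details are not given here. The names `two_means_1d` and `label_runs` are hypothetical, and which cluster corresponds to talking is left to the caller because the record does not state it.

```python
def two_means_1d(values, iters=20):
    """Batch 2-means on scalar values; label 0 = lower centroid, 1 = higher.

    Assumption: an offline simplification of the online clustering
    described in the record.
    """
    if not values:
        return []
    lo, hi = min(values), max(values)  # initial centroids
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to the nearer centroid (ties go to cluster 0).
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group0 = [v for v, l in zip(values, labels) if l == 0]
        group1 = [v for v, l in zip(values, labels) if l == 1]
        new_lo = sum(group0) / len(group0) if group0 else lo
        new_hi = sum(group1) / len(group1) if group1 else hi
        if (new_lo, new_hi) == (lo, hi):
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return labels

def label_runs(labels):
    """Collapse frame labels into (start_frame, end_frame, label) segments."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i - 1, labels[start]))
            start = i
    return segments

# Illustrative entropy values only, not data from the patent.
entropies = [0.1, 0.2, 0.1, 1.9, 2.0, 1.8, 0.2]
segments = label_runs(two_means_1d(entropies))
# segments == [(0, 2, 0), (3, 5, 1), (6, 6, 0)]
```

Each tuple marks a temporal segment of frames with the same cluster label; a downstream step would then decide which segments count as talking before passing them to emotion analysis.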
Address: Suwon-si KR