Title of invention: AUGMENTED MULTI-TIER CLASSIFIER FOR MULTI-MODAL VOICE ACTIVITY DETECTION
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for detecting voice activity in a media signal in an augmented, multi-tier classifier architecture. A system configured to practice the method can receive, from a first classifier, a first voice activity indicator detected in a first modality for a human subject. Then, the system can receive, from a second classifier, a second voice activity indicator detected in a second modality for the human subject, wherein the first voice activity indicator and the second voice activity indicator are based on the human subject at a same time, and wherein the first modality and the second modality are different. The system can concatenate, via a third classifier, the first voice activity indicator and the second voice activity indicator with original features of the human subject, to yield a classifier output, and determine voice activity based on the classifier output.
Publication number: US2015058004(A1) Publication date: 2015.02.26
Application number: US201313974453 Filing date: 2013.08.23
Applicant: AT & T Intellectual Property I, L.P. Inventors: Dimitriadis Dimitrios; Zavesky Eric; Burlick Matthew
Classification: G10L15/08 Main classification: G10L15/08
Agency: Agent:
Principal claim: 1. A method comprising: receiving, from a first classifier, a first voice activity indicator detected in a first modality for a human subject; receiving, from a second classifier, a second voice activity indicator detected in a second modality for the human subject, wherein the first voice activity indicator and the second voice activity indicators are based on the human subject at a same time, and wherein the first modality and the second modality are different; concatenating, via a third classifier, the first voice activity indicator and the second voice activity indicator with original features of the human subject, to yield a classifier output; and determining voice activity based on the classifier output.
Address: Atlanta GA US
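
The data flow recited in the principal claim can be illustrated with a minimal sketch. This is not the patented implementation: the record does not say which modalities, features, or classifier types are used, so everything below is an assumption made for illustration. The two first-tier classifiers (`audio_indicator`, `visual_indicator`), the logistic third classifier, the feature values, the weights, and the 0.5 threshold are all hypothetical. The only part taken from the record is the structure: two per-modality indicators for the same subject at the same time are concatenated with the original features, and a third classifier produces the output used to decide voice activity.

```python
import math
from typing import List, Sequence


def audio_indicator(audio_features: Sequence[float]) -> float:
    """Hypothetical first classifier: mean energy squashed into [0, 1)."""
    energy = sum(x * x for x in audio_features) / len(audio_features)
    return 1.0 - math.exp(-energy)


def visual_indicator(visual_features: Sequence[float]) -> float:
    """Hypothetical second classifier: mean absolute motion squashed into [0, 1)."""
    motion = sum(abs(x) for x in visual_features) / len(visual_features)
    return 1.0 - math.exp(-motion)


def augmented_input(audio_features: Sequence[float],
                    visual_features: Sequence[float]) -> List[float]:
    """Concatenate both indicators with the original features.

    Both feature vectors are assumed to describe the same subject
    at the same time instant.
    """
    return [
        audio_indicator(audio_features),
        visual_indicator(visual_features),
        *audio_features,
        *visual_features,
    ]


def third_classifier(x: Sequence[float],
                     weights: Sequence[float],
                     bias: float) -> float:
    """Hypothetical third classifier: logistic model over the augmented vector."""
    if len(x) != len(weights):
        raise ValueError("weights must match the augmented input length")
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def detect_voice_activity(audio_features: Sequence[float],
                          visual_features: Sequence[float],
                          weights: Sequence[float],
                          bias: float,
                          threshold: float = 0.5) -> bool:
    """Decide voice activity from the third classifier's output."""
    x = augmented_input(audio_features, visual_features)
    return third_classifier(x, weights, bias) >= threshold


# Illustrative, hand-picked parameters (not learned, not from the patent).
WEIGHTS = [2.0, 2.0, 0.5, 0.5, 0.5, 0.5]
BIAS = -2.0

speaking = detect_voice_activity([0.9, 0.8], [0.5, 0.4], WEIGHTS, BIAS)
silent = detect_voice_activity([0.0, 0.0], [0.0, 0.0], WEIGHTS, BIAS)
```

In this sketch the third classifier sees the raw features alongside the first-tier decisions, which is one plausible reading of "augmented": the final stage can weigh a modality's indicator against the evidence it was derived from rather than trusting the indicator alone.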