Title of invention SYSTEM AND METHOD FOR COMPRESSED DOMAIN LANGUAGE IDENTIFICATION
Abstract Embodiments included herein are directed towards a system and method for compressed domain language identification. Embodiments may include receiving a bitstream of a sequence of packets at one or more computing devices and classifying each packet into speech or non-speech based upon, at least in part, compressed domain voice activity detection (VAD). Embodiments may further include extracting a pseudo-cepstral representation from the speech detected packets and partially decoding without extracting a PCM format and generating a sequence of multi-frames, based upon, at least in part, the pseudo-cepstral representation. Embodiments may also include providing in real time the sequence of multi-frames to a deep neural network (DNN), wherein the DNN has been trained off-line for one or more desired target languages.
Publication number US2016093290(A1) Publication date 2016.03.31
Application number US201414499867 Filing date 2014.09.29
Applicant Nuance Communications, Inc. Inventors Lainez Jose; Barreda Daniel Almendro
Classification G10L15/00; H04M3/42; G10L15/16; G06F17/30; G06N3/08 Main classification G10L15/00
Agency Agent
Main claim 1. A compressed domain language identification method comprising: receiving a bitstream of a sequence of packets at one or more computing devices; classifying each packet into speech or non-speech based upon, at least in part, compressed domain voice activity detection (VAD); extracting a pseudo-cepstral representation from the speech detected packets and partially decoding without extracting a PCM format; generating a sequence of multi-frames, based upon, at least in part, the pseudo-cepstral representation; and providing in real time the sequence of multi-frames to a deep neural network (DNN), wherein the DNN has been trained off-line for one or more desired target languages.
Address Burlington MA US
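
Illustrative sketch (not part of the patent record). The claimed steps of packet classification, feature selection, and multi-frame generation might be sketched in Python as follows. Everything here is an assumption made for illustration: the packet layout (`gain`, `pseudo_cepstrum`), the gain-threshold VAD rule, the context width, and the edge padding are hypothetical and are not taken from the claim. The DNN stage is omitted.

```python
from typing import Any, Dict, List, Sequence

# Hypothetical packet layout: a codec gain parameter plus a pseudo-cepstral
# vector obtained by partial decoding (no PCM reconstruction).
Packet = Dict[str, Any]


def is_speech(packet: Packet, gain_threshold: float = 0.5) -> bool:
    """Toy compressed-domain VAD: threshold an assumed codec gain field."""
    return float(packet["gain"]) >= gain_threshold


def speech_features(packets: Sequence[Packet]) -> List[List[float]]:
    """Keep the pseudo-cepstral vectors of packets classified as speech."""
    return [list(p["pseudo_cepstrum"]) for p in packets if is_speech(p)]


def stack_multiframes(
    frames: Sequence[Sequence[float]], context: int = 1
) -> List[List[float]]:
    """Concatenate each frame with `context` neighbours on each side.

    Edges are padded by repeating the first or last frame (an assumption).
    """
    if not frames:
        return []
    last = len(frames) - 1
    multiframes: List[List[float]] = []
    for t in range(len(frames)):
        stacked: List[float] = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), last)  # clamp to valid range
            stacked.extend(frames[idx])
        multiframes.append(stacked)
    return multiframes


packets = [
    {"gain": 0.9, "pseudo_cepstrum": [1.0, 2.0]},
    {"gain": 0.1, "pseudo_cepstrum": [9.0, 9.0]},  # below threshold: non-speech
    {"gain": 0.8, "pseudo_cepstrum": [3.0, 4.0]},
    {"gain": 0.7, "pseudo_cepstrum": [5.0, 6.0]},
]
multiframes = stack_multiframes(speech_features(packets), context=1)
```

Each resulting multi-frame would then be passed, in order, to a classifier trained off-line for the target languages. Stacking neighbouring frames is one common way to give a frame-level network temporal context, though the record does not specify how its multi-frames are built.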