Title of Invention |
SYSTEM AND METHOD FOR COMPRESSED DOMAIN LANGUAGE IDENTIFICATION |
Abstract |
Embodiments included herein are directed towards a system and method for compressed domain language identification. Embodiments may include receiving a bitstream of a sequence of packets at one or more computing devices and classifying each packet into speech or non-speech based upon, at least in part, compressed domain voice activity detection (VAD). Embodiments may further include extracting a pseudo-cepstral representation from the speech detected packets and partially decoding without extracting a PCM format and generating a sequence of multi-frames, based upon, at least in part, the pseudo-cepstral representation. Embodiments may also include providing in real time the sequence of multi-frames to a deep neural network (DNN), wherein the DNN has been trained off-line for one or more desired target languages. |
Publication Number |
US2016093290(A1) |
Publication Date |
2016.03.31 |
Application Number |
US201414499867 |
Filing Date |
2014.09.29 |
Applicant |
Nuance Communications, Inc. |
Inventors |
Lainez Jose;Barreda Daniel Almendro |
Classification Codes |
G10L15/00;H04M3/42;G10L15/16;G06F17/30;G06N3/08 |
Main Classification Code |
G10L15/00 |
Patent Agency |
|
Patent Agent |
|
Main Claim |
1. A compressed domain language identification method comprising:
receiving a bitstream of a sequence of packets at one or more computing devices;
classifying each packet into speech or non-speech based upon, at least in part, compressed domain voice activity detection (VAD);
extracting a pseudo-cepstral representation from the speech detected packets and partially decoding without extracting a PCM format;
generating a sequence of multi-frames, based upon, at least in part, the pseudo-cepstral representation; and
providing in real time the sequence of multi-frames to a deep neural network (DNN), wherein the DNN has been trained off-line for one or more desired target languages. |
Address |
Burlington MA US |
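
Illustrative sketch (not part of the patent record). The claimed steps of classifying packets with compressed domain VAD, keeping the pseudo-cepstral representation of speech packets, and generating multi-frames could look roughly like the Python below. Everything named here is an assumption made for illustration: the `Packet` type, its `gain` and `features` fields, the gain-threshold VAD rule, and the reading of a "multi-frame" as a sliding window of consecutive feature vectors concatenated into one DNN input. The record itself does not specify any of these details, and the DNN is left out.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Packet:
    # Hypothetical fields: a real codec packet would need partial decoding
    # (without reconstructing PCM audio) to obtain values like these.
    gain: float            # codec gain parameter, used as a stand-in VAD cue
    features: List[float]  # stand-in for the pseudo-cepstral vector


def is_speech(packet: Packet, gain_threshold: float = 0.5) -> bool:
    """Toy compressed-domain VAD: thresholds a codec parameter (assumed rule)."""
    return packet.gain >= gain_threshold


def multi_frames(
    packets: Iterable[Packet],
    context: int = 3,
    gain_threshold: float = 0.5,
) -> Iterator[List[float]]:
    """Yield one stacked vector per window of `context` consecutive speech frames.

    Written as a generator so vectors can be produced as packets arrive,
    which loosely mirrors the "in real time" wording of the claim.
    """
    window: List[List[float]] = []
    for packet in packets:
        if not is_speech(packet, gain_threshold):
            continue  # non-speech packets are dropped before stacking
        window.append(packet.features)
        if len(window) == context:
            # Concatenate the frames in the window into a single input vector.
            yield [value for frame in window for value in frame]
            window.pop(0)  # slide the window forward by one frame


if __name__ == "__main__":
    stream = [
        Packet(0.9, [1.0, 2.0]),
        Packet(0.1, [9.0, 9.0]),  # below threshold, treated as non-speech
        Packet(0.8, [3.0, 4.0]),
        Packet(0.7, [5.0, 6.0]),
    ]
    for vector in multi_frames(stream, context=2):
        print(vector)  # each vector would be passed to an offline-trained DNN
```

In this sketch, each yielded vector is what would be handed to a language-identification network trained offline; the window length `context` and the threshold are placeholder values, not figures from the patent.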