Title of invention: Computer Implemented System and Method for Identifying Significant Speech Frames Within Speech Signals
Abstract: The present disclosure envisages a computer implemented system for identifying significant speech frames within speech signals to facilitate speech recognition. The system receives an input speech signal, represented by a plurality of feature vectors, and passes it through a spectrum analyzer. The spectrum analyzer divides the input speech signal into a plurality of speech frames and computes a spectral magnitude for each of the speech frames. A suitability engine computes, for each of the speech frames, suitability measures based on the spectral flatness measure (SFM), energy normalized variance (ENV), entropy, signal-to-noise ratio (SNR) and a similarity measure. The suitability engine further computes a weighted suitability measure for each of the speech frames.
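The framing and spectral-magnitude steps described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the record gives no frame length, overlap, window, or transform, so the function names `split_into_frames` and `spectral_magnitude`, the `frame_len` and `hop` parameters, and the naive discrete Fourier transform are all hypothetical choices, not the applicant's implementation.

```python
import cmath
import math


def split_into_frames(signal, frame_len, hop):
    """Divide a sampled signal into overlapping frames.

    Assumption: a trailing partial frame is dropped rather than padded.
    """
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def spectral_magnitude(frame):
    """Magnitude of the one-sided DFT of a frame.

    Assumption: a naive O(N^2) transform with no window function,
    used here only to keep the sketch dependency-free.
    """
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


# Illustrative input: a sinusoid completing 4 cycles every 32 samples.
signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(64)]
frames = split_into_frames(signal, frame_len=32, hop=16)
magnitudes = [spectral_magnitude(f) for f in frames]
```

Each entry of `magnitudes` is the per-frame spectral magnitude that the suitability engine would consume. In practice a windowed fast Fourier transform would likely replace the naive transform.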
Publication number: US2016155441(A1) Publication date: 2016.06.02
Application number: US201514670149 Filing date: 2015.03.26
Applicant: TATA CONSULTANCY SERVICES LTD. Inventors: Panda Ashish; Kopparapu Sunil Kumar
Classification: G10L15/20; G10L25/93 Main classification: G10L15/20
Agency: Agent:
Principal claim: 1. A computer implemented system for identifying significant speech frames within speech signals for facilitating speech recognition, said system comprising:
  an input module configured to accept at least an input speech signal, wherein the speech signal is represented by a plurality of feature vectors;
  a spectrum analyzer cooperating with said input module to receive the input speech signal, said spectrum analyzer comprising a divider configured to divide the input speech signal into a plurality of speech frames, said spectrum analyzer further configured to compute a spectral magnitude of each of the speech frames;
  an extractor cooperating with said spectrum analyzer to receive said speech frames, and configured to extract at least a feature vector from each of the speech frames;
  a suitability engine cooperating with the spectrum analyzer to receive the spectral magnitude of each of the speech frames and configured to compute a suitability measure for said speech frames, said suitability engine comprising:
    a spectral flatness module configured to receive the spectral magnitude of each of the speech frames and compute a spectral flatness measure to determine suitability measure for each of said speech frames;
    an energy normalized variance module configured to receive the spectral magnitude of each of the said speech frames and compute an energy normalized variance to determine suitability measure for each of said speech frames;
    an entropy module configured to receive the spectral magnitude of each of the said speech frames and compute entropy to determine suitability measure for each of said speech frames;
    a signal-to-noise ratio module configured to receive the spectral magnitude of each of the said speech frames and compute a frame level signal-to-noise ratio to determine suitability measure for each of said speech frames;
    a similarity measure module configured to receive the spectral magnitude of each of the said speech frames and compute a similarity measure to determine suitability measure for each of said speech frames;
    a final suitability measure module configured to receive from the spectral flatness module, the energy normalized variance module, the entropy module, the signal-to-noise ratio module and the similarity measure module, the computed suitability measure based on the spectral flatness measure, the energy normalized variance, the entropy, the frame level signal-to-noise ratio and the similarity measure of each of said speech frames respectively, and configured to compute a final suitability measure for each of said speech frames; and
  a frame weight assigner cooperating with the spectrum analyzer to receive the spectral magnitude of each of said speech frames and the suitability engine to receive the final suitability measure of each of said speech frames, and configured to compute weight for each of said speech frames to identify significant speech frames based on the spectral magnitude and the final suitability measure of respective speech frame.
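Three of the per-frame measures named in the claim, plus a weighted combination, might be sketched as below. The record states no formulas, so everything here is an assumption: spectral flatness and entropy follow their common textbook definitions, the energy normalized variance is taken to be the variance of the magnitude spectrum divided by its energy, and `final_suitability` is a hypothetical weighted average. The signal-to-noise ratio and similarity modules are omitted because the record does not say how the noise estimate or the reference for comparison is obtained.

```python
import math

EPS = 1e-12  # guards against log(0) and division by zero


def spectral_flatness(mag):
    """Geometric mean over arithmetic mean of the power spectrum (0 to 1)."""
    power = [m * m + EPS for m in mag]
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def energy_normalized_variance(mag):
    """Variance of the magnitude spectrum divided by its energy.

    Assumption: this definition is a guess; the record does not give one.
    """
    mean = sum(mag) / len(mag)
    variance = sum((m - mean) ** 2 for m in mag) / len(mag)
    energy = sum(m * m for m in mag) + EPS
    return variance / energy


def spectral_entropy(mag):
    """Shannon entropy, in bits, of the spectrum normalized to sum to 1."""
    total = sum(mag) + EPS
    probs = [m / total for m in mag]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def final_suitability(scores, weights):
    """Weighted average of per-measure scores (hypothetical combination rule)."""
    total_weight = sum(weights[name] for name in scores)
    return sum(weights[name] * scores[name] for name in scores) / total_weight


# Illustrative spectra: one perfectly flat, one with a single dominant bin.
flat = [1.0] * 8
peaky = [8.0] + [0.0] * 7
measures = {
    "flat": (spectral_flatness(flat), spectral_entropy(flat)),
    "peaky": (spectral_flatness(peaky), spectral_entropy(peaky)),
}
```

A flat spectrum yields flatness near 1 and high entropy, while a spectrum dominated by one bin yields values near 0 for both. How each raw measure maps to a suitability score, and which weights the final module applies, would have to come from the full specification.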
Address: Mumbai IN