Title of invention: METHOD AND SYSTEM FOR CONSONANT-VOWEL RATIO MODIFICATION FOR IMPROVING SPEECH PERCEPTION
Abstract: Increasing the level of the consonant segments relative to the nearby vowel segments, known as consonant-vowel ratio (CVR) modification, is reported to be effective in improving speech intelligibility for listeners in noisy backgrounds and for hearing-impaired listeners. A method and a system for real-time CVR modification, using the rate of change of the spectral centroid to detect spectral transitions, are disclosed. A preferred embodiment of the invention using a 16-bit fixed-point processor with on-chip FFT hardware is also presented for real-time signal processing. It can be integrated with other FFT-based signal processing in communication devices, hearing aids, and other systems for improving speech perception under adverse listening conditions.
Publication number: US2016365099(A1) Publication date: 2016.12.15
Application number: US201515121599 Filing date: 2015.01.27
Applicant: INDIAN INSTITUTE OF TECHNOLOGY BOMBAY Inventors: Pandey Prem Chand; Jayan Ammanath Ramakrishnan; Tiwari Nitya
Classification: G10L21/0232; G10L25/21; G10L25/87; G10L21/0264; G10L21/0364 Main classification: G10L21/0232
Agency: Agent:
Main claim: 1. A method for improving speech perception, by processing a digital speech signal, the method comprising: detecting perceptually salient segments in said digital speech signal; calculating time-varying gain in accordance with a location and energy of said detected segments in said digital speech signal; and applying said time-varying gain to said digital speech signal, wherein detecting said perceptually salient segments and calculating said time-varying gain comprises: windowing samples of said digital speech signal to form overlapping frames and calculating energy of said frames; smoothening said frame energy by a moving-average filter to obtain smoothened short-time energy; applying a peak detector with exponential decay on said frame energy to track peak energy; generating a low-frequency tone and multiplying said low-frequency tone with said peak energy and adding a resulting scaled tone to said digital speech signal to obtain a tone-added signal; windowing said tone-added signal and applying Discrete Fourier transform (DFT) to obtain short-time magnitude spectrum of said tone-added signal; applying a moving-average filter on said short-time magnitude spectrum to obtain a smoothened short-time magnitude spectrum; calculating a spectral centroid of said smoothened short-time magnitude spectrum; smoothening said spectral centroid by median filtering to obtain a smoothened spectral centroid; calculating a first-difference of said smoothened spectral centroid to obtain a rate of change of said smoothened spectral centroid; and selecting said time-varying gain using said smoothened short-time energy, said peak energy, and said rate of change of said smoothened spectral centroid, wherein modification of perceptually salient segments is carried out without significantly increasing loudness level of said digital speech signal.
Address: Maharashtra IN
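The processing steps recited in the main claim can be illustrated with a minimal offline sketch in Python with NumPy. It is not the patented implementation: the record gives no numeric parameters and no gain-selection rule, so every name and value here is an assumption made for illustration. That includes `cvr_modify`, the frame length and hop, the 100 Hz tone, the tone amplitude taken from the square root of the tracked peak energy, the smoothing of the magnitude spectrum across frequency bins, and the rule that boosts frames with a fast-moving centroid and low energy relative to the tracked peak.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def frame_signal(x, n, hop):
    """Split x into overlapping frames of length n with step hop."""
    count = 1 + (len(x) - n) // hop
    idx = np.arange(n)[None, :] + hop * np.arange(count)[:, None]
    return x[idx]


def moving_average(v, k):
    """Moving-average filter with k taps (zero-padded at the edges)."""
    return np.convolve(v, np.ones(k) / k, mode="same")


def median_filter(v, k):
    """Median filter with odd window k (edge values repeated)."""
    padded = np.pad(v, k // 2, mode="edge")
    return np.median(sliding_window_view(padded, k), axis=-1)


def peak_track(energy, decay):
    """Peak detector with exponential decay, one value per frame."""
    peak = np.empty_like(energy)
    p = 0.0
    for i, e in enumerate(energy):
        p = max(e, p * decay)
        peak[i] = p
    return peak


def cvr_modify(x, fs, n=256, hop=64, tone_hz=100.0, tone_scale=0.05,
               max_gain_db=9.0, roc_ref=100.0, decay=0.99):
    """Return (modified signal, per-sample gain). All defaults are assumed."""
    win = np.hanning(n)
    eps = 1e-12

    # Frame energy, smoothened short-time energy, and tracked peak energy.
    energy = np.sum((frame_signal(x, n, hop) * win) ** 2, axis=1)
    smooth_e = moving_average(energy, 5)
    peak_e = peak_track(energy, decay)

    # Low-frequency tone scaled by the tracked peak (assumed amplitude rule).
    t = np.arange(len(x)) / fs
    frame_of_sample = np.minimum(np.arange(len(x)) // hop, len(energy) - 1)
    amp = tone_scale * np.sqrt(peak_e[frame_of_sample] / n)
    tone_added = x + amp * np.sin(2 * np.pi * tone_hz * t)

    # Short-time magnitude spectrum, smoothened across bins (assumed axis).
    mag = np.abs(np.fft.rfft(frame_signal(tone_added, n, hop) * win, axis=1))
    mag = np.apply_along_axis(moving_average, 1, mag, 3)

    # Spectral centroid, median smoothing, and first difference (Hz per frame).
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    centroid = (mag @ freqs) / (np.sum(mag, axis=1) + eps)
    centroid = median_filter(centroid, 5)
    roc = np.diff(centroid, prepend=centroid[0])

    # Assumed gain rule: boost spectral transitions that are weak relative to
    # the tracked peak, so that loud segments are left essentially unchanged.
    transition = np.clip(np.abs(roc) / roc_ref, 0.0, 1.0)
    weakness = np.clip(1.0 - smooth_e / np.maximum(peak_e, eps), 0.0, 1.0)
    gain_frames = 10.0 ** (max_gain_db * transition * weakness / 20.0)

    # Interpolate the frame gains to the sample rate and apply them.
    centers = hop * np.arange(len(energy)) + n / 2
    gain = np.interp(np.arange(len(x)), centers, gain_frames)
    return x * gain, gain
```

Calling `y, gain = cvr_modify(x, fs)` on a mono floating-point signal returns the modified signal and the per-sample gain. Because the assumed rule multiplies a transition term by a low-energy term, steady high-energy segments keep a gain close to 1, which is one way to approximate the claim's requirement of not significantly increasing loudness. A real-time version would process frame by frame with a fixed delay rather than over the whole signal.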