Title of invention Speech intelligibility predictor and applications thereof
Abstract The application relates to a method of providing a speech intelligibility predictor value for estimating an average listener's ability to understand a target speech signal when said target speech signal is subject to a processing algorithm and/or is received in a noisy environment. The application further relates to a method of improving a listener's understanding of a target speech signal in a noisy environment and to corresponding device units. The object of the present application is to provide an alternative objective intelligibility measure, e.g. a measure that is suitable for use in a time-frequency environment. The invention may be used, for example, in audio processing systems such as listening systems, e.g. hearing aid systems.
Publication number US9064502(B2) Publication date 2015.06.23
Application number US201113045303 Filing date 2011.03.10
Applicant OTICON A/S Inventors Taal Cees H.; Hendriks Richard; Heusdens Richard; Kjems Ulrik; Jensen Jesper
Classification G10L21/02; G10L15/16; G10L25/69 Main classification G10L21/02
Agency Birch, Stewart, Kolasch & Birch, LLP Agent Birch, Stewart, Kolasch & Birch, LLP
Main claim 1. A method of providing a speech intelligibility predictor value for estimating an average listener's ability to understand a target speech sound when said target speech sound is subject to a processing algorithm and/or is received in a noisy environment, the method comprising: electrically receiving a first signal x(n) representing the target speech sound as a target speech signal; a) providing a time-frequency representation, xj(m), of the first signal x(n), representing the target speech signal in a number of frequency bands and a number of time instances, j being a frequency band index and m being a time index; b) providing a time-frequency representation, yj(m), of a second signal y(n), the second signal being a noisy and/or processed version of said target speech signal in a number of frequency bands and a number of time instances; c) providing first and second intelligibility prediction inputs in the form of modified time-frequency representations xj*(m) and yj*(n) of the first and second signals or signals derived therefrom, respectively; d) providing time-frequency dependent intermediate speech intelligibility coefficients dj(m) based on said first and second intelligibility prediction inputs; e) calculating a final speech intelligibility predictor d by averaging said intermediate speech intelligibility coefficients dj(m) over a number J of frequency indices and a number M of time indices; wherein the speech intelligibility coefficients dj(m) at given time instants m are calculated as

$$d_j(m) = \frac{\sum_{n=N_1}^{N_2} \left(x_j^*(n) - r_{x_j^*}\right)\left(y_j^*(n) - r_{y_j^*}\right)}{\sqrt{\sum_{n=N_1}^{N_2} \left(x_j^*(n) - r_{x_j^*}\right)^2 \, \sum_{n=N_1}^{N_2} \left(y_j^*(n) - r_{y_j^*}\right)^2}}$$

where xj*(n) and yj*(n) are effective amplitudes of the j'th time-frequency unit at time instant n of the first and second intelligibility prediction inputs, respectively, and where N1 ≤ m ≤ N2, $r_{x_j^*}$ and $r_{y_j^*}$ are constants, and N2 − N1 ≤ 400 ms.
Address Smorum DK
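
Steps d) and e) of the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the list-of-lists data layout and the `seg_len` parameter are hypothetical. The claim says only that the two r terms are constants, so the sketch assumes they are the means of each analysis window, which makes dj(m) a sample correlation coefficient. The modification of step c) that produces xj*(m) and yj*(m) is not modelled, and the inputs are taken to be those effective amplitudes already. Returning 0.0 for a window with zero variance is a choice made here, not something the claim specifies.

```python
import math


def intermediate_coefficient(x_seg, y_seg):
    """Correlation-style coefficient d_j(m) for one band over one window n = N1..N2.

    x_seg, y_seg: equal-length sequences of effective amplitudes.
    Assumption: the constants r_x and r_y are the window means.
    """
    if len(x_seg) != len(y_seg) or len(x_seg) == 0:
        raise ValueError("segments must be non-empty and of equal length")
    r_x = sum(x_seg) / len(x_seg)
    r_y = sum(y_seg) / len(y_seg)
    num = sum((x - r_x) * (y - r_y) for x, y in zip(x_seg, y_seg))
    den = math.sqrt(
        sum((x - r_x) ** 2 for x in x_seg) * sum((y - r_y) ** 2 for y in y_seg)
    )
    # Assumption: a window with zero variance contributes 0.0 instead of dividing by zero.
    return num / den if den > 0.0 else 0.0


def intelligibility_predictor(x_bands, y_bands, seg_len):
    """Average d_j(m) over all bands j and all frames m that have a full window.

    x_bands, y_bands: one list of frame amplitudes per frequency band.
    seg_len: window length in frames (hypothetical parameter); the window
    ends at frame m, so N2 = m and N1 = m - seg_len + 1, which satisfies
    N1 <= m <= N2. The caller is assumed to pick seg_len so that the
    window spans at most 400 ms.
    """
    if seg_len < 2:
        raise ValueError("seg_len must be at least 2 frames")
    coeffs = []
    for x_band, y_band in zip(x_bands, y_bands):
        for m in range(seg_len - 1, len(x_band)):
            lo = m - seg_len + 1
            coeffs.append(
                intermediate_coefficient(x_band[lo:m + 1], y_band[lo:m + 1])
            )
    if not coeffs:
        raise ValueError("no complete analysis window available")
    return sum(coeffs) / len(coeffs)


# Toy data: two bands, five frames each (made-up numbers for illustration).
clean = [[1.0, 2.0, 4.0, 3.0, 5.0], [0.5, 0.7, 0.2, 0.9, 0.4]]
noisy = [[1.2, 1.9, 3.5, 3.3, 4.6], [0.6, 0.5, 0.4, 0.8, 0.5]]
d = intelligibility_predictor(clean, noisy, seg_len=3)
```

In this sketch every band contributes the same number of windows, so the final average is the mean over J bands and M time indices described in step e).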