Title of invention: Enhancing perception of frequency-lowered speech
Abstract: Among other things, a sound processing device system is disclosed to assist a hearing-impaired human listener recognize speech sounds or phonemes. The device system may be configured at least to generate an output audio signal at least by transposing and causing a negative rank ordering of frequency of at least a portion of the input audio signal. Compression also may be performed on the at least the portion of the input audio signal as part of generating the output audio signal. The negative rank ordering may be performed on a high-frequency portion of the input audio signal that becomes a low-frequency portion of the output audio signal by the transposing. The low-frequency portion of the output audio signal may represent an inverted ordering of frequencies or frequency segments present in the high-frequency portion of the input audio signal.
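The transposition with negative rank ordering described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the signal is already available as a list of per-bin spectral magnitudes, and the function name `lower_with_inversion`, its parameters, the mean-based compression, and the choice to clear the source band and add into the destination band are all hypothetical choices made for illustration.

```python
def lower_with_inversion(mags, src_start, src_stop, dst_start, ratio=1):
    """Move bins [src_start, src_stop) down to dst_start in reversed order.

    mags      -- list of per-bin magnitudes, lowest frequency first
    ratio     -- hypothetical compression factor: every `ratio` adjacent
                 source bins are merged into one bin by taking their mean
    Returns a new list; the input is not modified.
    """
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    source = mags[src_start:src_stop]

    # Optional compression: merge groups of adjacent source bins.
    compressed = []
    for i in range(0, len(source), ratio):
        group = source[i:i + ratio]
        compressed.append(sum(group) / len(group))

    # Negative rank ordering: the highest source frequency becomes the
    # lowest destination frequency, and vice versa.
    inverted = compressed[::-1]

    if dst_start < 0 or dst_start + len(inverted) > len(mags):
        raise ValueError("destination band does not fit in the spectrum")

    out = list(mags)
    # Assumption: the source band is cleared once it has been moved.
    for i in range(src_start, min(src_stop, len(out))):
        out[i] = 0.0
    # Assumption: transposed energy is added to whatever is already there.
    for offset, value in enumerate(inverted):
        out[dst_start + offset] += value
    return out


if __name__ == "__main__":
    spectrum = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
    print(lower_with_inversion(spectrum, 4, 8, 0))
    print(lower_with_inversion(spectrum, 4, 8, 0, ratio=2))
```

In this sketch the bin that held the highest input frequency lands in the lowest output bin, which is the inverted ordering the abstract refers to; with `ratio=2` the same band is additionally squeezed into half as many bins.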
Publication number: US9173041(B2) Publication date: 2015.10.27
Application number: US201313906021 Filing date: 2013.05.30
Applicant: PURDUE RESEARCH FOUNDATION Inventor: Alexander Joshua M.
Classification: H04R25/00;H03G5/00;G10L25/00;G10L15/00;G10L17/00;G10L15/22;G10L15/08;G10L15/26;G10L21/0208 Main classification: H04R25/00
Agency: Rossi, Kimms & McDowell LLP Agent: Rossi, Kimms & McDowell LLP
Principal claim: 1. A sound processing device system configured to assist a hearing-impaired human listener recognize sounds, the sound processing device system comprising:
  a memory device system; and
  a data processing device system communicatively connected to the memory device system, the data processing device system configured by a program stored in the memory device system at least to:
    receive an input audio signal; and
    generate an output audio signal at least by transposing and causing a negative rank ordering of frequency of at least a portion of the input audio signal,
  wherein the input audio signal is a first portion of an input audio signal stream,
  wherein the output audio signal is a first portion of an output audio signal stream, and
  wherein the data processing device system is configured by the program at least to:
    identify a speech pattern present in the first portion of the input audio signal stream;
    generate, in response to the speech pattern being identified as present in the first portion of the input audio signal stream, the first portion of the output audio signal stream at least by inverting a frequency relationship of at least part of the first portion of the input audio signal stream;
    identify that the speech pattern is not present in a second portion of the input audio signal stream that is other than the first portion of the input audio signal stream;
    generate, in response to identifying that the speech pattern is not present in the second portion of the input audio signal stream, a second portion of the output audio signal stream without inverting the frequency relationship of at least part of the second portion of the input audio signal stream, the second portion of the output audio signal stream being other than the first portion of the output audio signal stream;
    identify that the first portion of the input audio signal stream exhibits higher energy at a high-frequency range as compared to a mid-frequency range of the first portion of the input audio signal stream;
    cause, by way of at least a gain, an attenuation, or both a gain and an attenuation, and in response to identifying that the first portion of the input audio signal stream exhibits the higher energy at the high-frequency range, a low-frequency range of the first portion of the output audio signal stream to be relatively emphasized or de-emphasized as compared to another frequency range of the first portion of the output audio signal stream or another time segment of the output audio signal stream to generate a perceptual cue to facilitate distinguishing of similar sounds, the low-frequency range of the first portion of the output audio signal stream corresponding, prior to the inverting the frequency relationship of the first portion of the input audio signal stream, to the high-frequency range of the first portion of the input audio signal stream;
    identify a speech pattern present in a third portion of the input audio signal stream that is other than the first portion of the input audio signal stream and the second portion of the input audio signal stream;
    generate, in response to the speech pattern being identified as present in the third portion of the input audio signal stream, a third portion of the output audio signal stream at least by inverting a frequency relationship of the third portion of the input audio signal stream, the third portion of the output audio signal stream being other than the first portion of the output audio signal stream and the second portion of the output audio signal stream;
    identify that the third portion of the input audio signal stream exhibits higher energy at a mid-frequency range as compared to a high-frequency range of the third portion of the input audio signal stream; and
    output the third portion of the output audio signal stream without causing, by way of at least a gain, an attenuation, or both a gain and an attenuation, a low-frequency range of the third portion of the output audio signal stream to be relatively emphasized or de-emphasized as compared to another frequency range of the third portion of the output audio signal stream or another time segment of the output audio signal stream, the low-frequency range of the third portion of the output audio signal stream corresponding, prior to the inverting the frequency relationship of the third portion of the input audio signal stream, to the high-frequency range of the third portion of the input audio signal stream.
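The per-portion decisions recited in the claim can be summarized as a small decision function. This is a reading aid only, not the claimed system: the name `plan_portion`, the boolean speech-pattern flag, and the scalar energy inputs are hypothetical, and two behaviors the claim leaves open are filled in by assumption (no gain adjustment when no speech pattern is found, and no gain adjustment when mid and high energies are equal).

```python
from typing import NamedTuple


class PortionPlan(NamedTuple):
    invert_frequencies: bool    # invert the frequency relationship?
    adjust_low_band_gain: bool  # emphasize/de-emphasize the low band?


def plan_portion(speech_pattern_present: bool,
                 mid_energy: float,
                 high_energy: float) -> PortionPlan:
    """Decide how one portion of the input stream is processed.

    Mirrors the three cases in the claim:
      first portion  -- speech pattern, high > mid: invert and adjust gain
      second portion -- no speech pattern: do not invert
      third portion  -- speech pattern, mid > high: invert, no gain change
    """
    if not speech_pattern_present:
        # Assumption: gain is left untouched; the claim only says
        # the portion is generated without inversion.
        return PortionPlan(False, False)
    if high_energy > mid_energy:
        return PortionPlan(True, True)
    # Covers mid > high; the equal-energy case is an assumption.
    return PortionPlan(True, False)


if __name__ == "__main__":
    print(plan_portion(True, mid_energy=1.0, high_energy=2.0))   # first
    print(plan_portion(False, mid_energy=1.0, high_energy=2.0))  # second
    print(plan_portion(True, mid_energy=2.0, high_energy=1.0))   # third
```

The gain adjustment applies to the low-frequency range of the output, which after inversion carries what was the high-frequency range of the input.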
Address: West Lafayette, IN, US