Title of invention Method and apparatus for language and speaker recognition
Abstract An initial learning phase creates histograms for each of the languages to be recognized. A first pass enters a number of samples of speech, and at each predetermined instant of time, each sample of speech is Fast Fourier Transformed (FFT) to create a spectrum showing the frequency content of the speech at that instant (a spectral vector). This frequency content is compared with previously stored frequency contents. If the current spectral vector is close enough to a previously stored spectral vector, a weighted average of the two is formed, and a weight indicating frequency of occurrence is incremented. If the current spectral vector is not similar to any previously stored one, it is stored with an initial weight of "1". The most common frequency spectra are determined for all of the languages grouped together to form a composite basis set. A second pass then puts a sample of sounds through the Fast Fourier Transform to again obtain frequency spectra. The obtained spectra are compared against all of the prestored frequency spectra in the composite basis set, and a closest match is determined. The number of occurrences of each frequency spectrum in the composite basis set is plotted as a histogram. This histogram is used during the recognition phase to determine a closest fit between an unknown language and one of the known languages.
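The two passes described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not specify the similarity measure, the threshold, the exact form of the weighted average, or how histograms are compared, so the Euclidean distance, the `threshold` parameter, the running-mean update, and all function names below are assumptions. A naive DFT stands in for the FFT to keep the example dependency-free, and the synthetic tones are illustrative data only.

```python
import cmath
import math
from collections import Counter

def spectrum(frame):
    """Magnitude spectrum of one frame (naive DFT used here in place of an FFT)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) / n
            for k in range(n // 2)]

def distance(a, b):
    """Euclidean distance; the similarity measure is an assumption."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def learn_basis(frames, threshold):
    """First pass: merge similar spectral vectors and count occurrences."""
    basis = []  # each entry is [spectral_vector, weight]
    for frame in frames:
        v = spectrum(frame)
        best = min(basis, key=lambda e: distance(e[0], v), default=None)
        if best is not None and distance(best[0], v) <= threshold:
            w = best[1]
            # Weighted average of stored and current vector (assumed running mean).
            best[0] = [(w * p + q) / (w + 1) for p, q in zip(best[0], v)]
            best[1] = w + 1
        else:
            basis.append([v, 1])  # new vector starts with weight 1
    return basis

def top_vectors(basis, k):
    """Keep the k most frequently seen vectors as the composite basis set."""
    ranked = sorted(basis, key=lambda e: e[1], reverse=True)
    return [vec for vec, _ in ranked[:k]]

def histogram(frames, vectors):
    """Second pass: normalized count of closest-match basis vectors."""
    counts = Counter(
        min(range(len(vectors)), key=lambda i: distance(vectors[i], spectrum(f)))
        for f in frames)
    total = sum(counts.values())
    return [counts[i] / total for i in range(len(vectors))]

def closest_language(unknown_hist, known_hists):
    """Recognition: pick the known language with the nearest histogram."""
    return min(known_hists, key=lambda name: distance(unknown_hist, known_hists[name]))

def tone(k, n=16):
    """Synthetic frame: one sine wave at DFT bin k (illustrative data only)."""
    return [math.sin(2 * math.pi * k * t / n) for t in range(n)]

# Two toy "languages" that differ only in how often each tone occurs.
lang_a = [tone(2)] * 3 + [tone(5)]
lang_b = [tone(5)] * 3 + [tone(2)]

basis = learn_basis(lang_a + lang_b, threshold=0.1)
vectors = top_vectors(basis, k=2)
known = {"A": histogram(lang_a, vectors), "B": histogram(lang_b, vectors)}

unknown = [tone(2), tone(2), tone(5)]
result = closest_language(histogram(unknown, vectors), known)
print(result)
```

In this toy setup the unknown sample is dominated by the tone that is most common in language "A", so its histogram lies nearer to that language's histogram. Real speech would need windowed frames, a much larger basis set, and a tuned threshold.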
Publication number US5189727(A) Publication date 1993.02.23
Application number US19910752898 Filing date 1991.08.26
Applicant ELECTRONIC WARFARE ASSOCIATES, INC.;SYSTEMS TECHNOLOGY ASSOCIATES Inventor GUERRERI, STEPHEN J.
Classification G10L17/00 Main classification G10L17/00
Agency Agent
Independent claim
Address