Abstract |
<p>PROBLEM TO BE SOLVED: To provide a speech authentication system that identifies a person by voice and performs speaker matching with a high authentication rate, without being adversely affected by the pitch or loudness of the speech or by noise. SOLUTION: The speech signal uttered by the registered person at registration and the speech signal uttered by the person being verified at matching are each divided into frames of a specific time unit, and LSP (Line Spectrum Pair) coefficients are calculated for each frame. The per-frame LSP values are arranged in time series to form a two-dimensional LSP image. The image positions Ti in the LSP image of the matching speech at which the correlation coefficient is maximal are detected, and whether the registered speech and the matching speech are identical is decided from the differences Δi at the respective image positions. Speech matching can thus use LSP coefficients, which represent the features of the spoken voice better than FFT-based speech analysis does.</p> |
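The matching step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: each LSP image is already available as a list of frames (one list of LSP values per frame), the registered image is cut into fixed-length blocks that serve as templates, the correlation coefficient is the Pearson coefficient, Δi is taken as the offset between Ti and the block's original position, and the thresholds `max_delta` and `min_corr` are arbitrary. All function and parameter names are hypothetical, and the LSP extraction itself is not shown.

```python
import math
import statistics


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant block carries no shape information
    return cov / math.sqrt(var_a * var_b)


def flatten(frames):
    """Turn a run of frames (rows of the LSP image) into one flat vector."""
    return [value for frame in frames for value in frame]


def best_positions(registered, matched, block=4):
    """Slide each block of the registered image along the matched image.

    Returns a list of (start, t_i, peak) tuples, where start is the block's
    position in the registered image, t_i is the frame offset in the matched
    image with the highest correlation, and peak is that correlation.
    """
    results = []
    last_offset = len(matched) - block
    for start in range(0, len(registered) - block + 1, block):
        template = flatten(registered[start:start + block])
        scores = [correlation(template, flatten(matched[t:t + block]))
                  for t in range(last_offset + 1)]
        if not scores:  # matched image is shorter than one block
            break
        t_i = max(range(len(scores)), key=scores.__getitem__)
        results.append((start, t_i, scores[t_i]))
    return results


def same_speaker(registered, matched, block=4, max_delta=2, min_corr=0.9):
    """Assumed decision rule: every block must correlate strongly, and the
    offsets delta_i = t_i - start must agree to within max_delta frames."""
    results = best_positions(registered, matched, block)
    if not results:
        return False
    deltas = [t_i - start for start, t_i, _ in results]
    centre = statistics.median(deltas)
    aligned = all(abs(d - centre) <= max_delta for d in deltas)
    strong = all(peak >= min_corr for _, _, peak in results)
    return aligned and strong
```

Comparing the offsets with their median rather than with zero is a design choice made for this sketch: it tolerates a constant delay before the utterance starts while still rejecting blocks whose best match lands at scattered positions.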