Title: Objective speech quality metric
Abstract: Methods and systems are provided for using a model of human speech quality perception to provide an objective measure for predicting subjective quality assessments. A Virtual Speech Quality Objective Listener (ViSQOL) model is a signal-based full-reference metric that uses a spectro-temporal measure of similarity between a reference signal and a test speech signal. Specifically, the model can detect and predict the level of clock drift and determine whether such clock drift will impact a listener's quality of experience.
Publication No.: US9524733(B2)  Publication date: 2016.12.20
Application No.: US201313891978  Filing date: 2013.05.10
Applicant: Google Inc.  Inventors: Skoglund Jan; Hines Andrew J.; Harte Naomi A.; Kokaram Anil
Classification: G10L25/60  Main classification: G10L25/60
Agency: Brake Hughes Bellermann LLP  Agent: Brake Hughes Bellermann LLP
Main claim: 1. A method for determining speech quality comprising: receiving a first signal and a second signal, wherein the second signal is a degraded version of the first signal; creating a time-frequency representation for each of the two signals; using the time-frequency representation for the first signal to select at least one portion of the first signal containing speech data; identifying, based on time-frequency representation for the second signal, at least one portion of the second signal corresponding to the at least one portion of the first signal; determining a level of similarity between the second signal and the first signal based on a comparison of the at least one portion of the second signal and the corresponding at least one portion of the first signal, wherein the level of similarity is determined using Neurogram Similarity Index Measure (NSIM); and generating a speech quality estimate based on the level of similarity determined using NSIM.
Address: Mountain View, CA, US
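
The comparison steps recited in the main claim (matching a reference portion to the degraded signal, scoring it with NSIM, and deriving a quality estimate) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes the time-frequency representations are already available as lists of rows (frequency bands by time frames) with values in [0, 1], uses uniform 3x3 windows, and takes the stabilising constants `c1` and `c2` as assumed values. The function names `nsim`, `best_match`, and `quality_estimate` are hypothetical, and averaging the per-patch scores is only one possible way to form the estimate.

```python
import math


def _window_stats(a, b, r, c, size=3):
    """Means, variances and covariance of the size x size windows at (r, c)."""
    xs = [a[i][j] for i in range(r, r + size) for j in range(c, c + size)]
    ys = [b[i][j] for i in range(r, r + size) for j in range(c, c + size)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return mx, my, vx, vy, cov


def nsim(ref, deg, dynamic_range=1.0):
    """Mean of (intensity term * structure term) over all 3x3 windows.

    ref and deg are equal-size patches, at least 3x3.
    """
    c1 = 0.01 * dynamic_range         # assumed constant, not from the record
    c2 = (0.03 * dynamic_range) ** 2  # assumed constant, not from the record
    scores = []
    for r in range(len(ref) - 2):
        for c in range(len(ref[0]) - 2):
            mx, my, vx, vy, cov = _window_stats(ref, deg, r, c)
            intensity = (2 * mx * my + c1) / (mx * mx + my * my + c1)
            structure = (cov + c2) / (math.sqrt(vx * vy) + c2)
            scores.append(intensity * structure)
    return sum(scores) / len(scores)


def best_match(ref_patch, deg_spec):
    """Slide ref_patch along the time axis of deg_spec; return (offset, NSIM)."""
    width = len(ref_patch[0])
    best_offset, best_score = 0, -math.inf
    for off in range(len(deg_spec[0]) - width + 1):
        candidate = [row[off:off + width] for row in deg_spec]
        score = nsim(ref_patch, candidate)
        if score > best_score:
            best_offset, best_score = off, score
    return best_offset, best_score


def quality_estimate(ref_patches, deg_spec):
    """Hypothetical estimate: average of the best NSIM found for each patch."""
    scores = [best_match(p, deg_spec)[1] for p in ref_patches]
    return sum(scores) / len(scores)


# Toy data: the reference patch is embedded two frames into the degraded signal.
ref_patch = [
    [0.1, 0.9, 0.3, 0.7],
    [0.8, 0.2, 0.6, 0.4],
    [0.5, 0.5, 0.9, 0.1],
    [0.2, 0.7, 0.4, 0.8],
]
deg_spec = [[0.0, 0.0] + row + [0.0, 0.0] for row in ref_patch]
offset, score = best_match(ref_patch, deg_spec)
```

In this toy case the search recovers the two-frame offset and scores the aligned patch at approximately 1, the value NSIM takes for identical patches. A real system would first compute spectrograms from audio and select speech-bearing patches from the reference, which this sketch leaves out.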