Title: SPEAKER VERIFICATION USING CO-LOCATION INFORMATION
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying a user in a multi-user environment. One of the methods includes receiving, by a first user device, an audio signal encoding an utterance, obtaining, by the first user device, a first speaker model for a first user of the first user device, obtaining, by the first user device for a second user of a second user device that is co-located with the first user device, a second speaker model for the second user or a second score that indicates a respective likelihood that the utterance was spoken by the second user, and determining, by the first user device, that the utterance was spoken by the first user using (i) the first speaker model and the second speaker model or (ii) the first speaker model and the second score.
Publication number: US2016019896(A1) Publication date: 2016.01.21
Application number: US201414335380 Filing date: 2014.07.18
Applicant: Google Inc. Inventors: Alvarez Guevara Raziel; Hansson Othar
Classification: G10L17/00 Main classification: G10L17/00
Agency: Agent:
Principal claim: 1. A computer-implemented method comprising: receiving, by a first user device, an audio signal encoding an utterance; obtaining, by the first user device, a first speaker model that is specific to a first user of the first user device; determining, by the first user device, that a second user device used by a second user is co-located with the first user device; in response to determining that the second user device used by the second user is co-located with the first user device, determining, by the first user device, whether the first user device has one or more settings that allow the first user device access to a second speaker model that is specific to the second user; in response to determining that the first user device has one or more settings that allow the first user device access to the second speaker model, obtaining, by the first user device from the second user device used by the second user, a second speaker model that is specific to the second user; determining, by the first user device, that the utterance was spoken by the first user using the first speaker model that is specific to the first user of the first user device and the second speaker model that is specific to the second user associated with the second user device; analyzing the audio signal to identify a command included in the utterance in response to determining that the utterance was spoken by the first user; and performing an action that corresponds with the command.
Address: Mountain View CA US
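
The sequence of steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent. Every name in it (SpeakerModel, Device, can_access_other_models, handle_utterance) is hypothetical, and so are the choices to represent a speaker model as a fixed embedding vector, to score by cosine similarity, and to accept the utterance only when the first user's score meets a threshold and exceeds every co-located user's score. The claim does not specify any of these. The sketch also assumes the utterance has already been reduced to a feature vector and a transcript, standing in for the audio analysis step.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class SpeakerModel:
    """Assumed representation: one embedding vector per user."""
    user: str
    embedding: Sequence[float]

    def score(self, features: Sequence[float]) -> float:
        # Higher score means the utterance is more likely from this user.
        return cosine(self.embedding, features)


@dataclass
class Device:
    model: SpeakerModel
    # Hypothetical setting: may this device use other users' speaker models?
    can_access_other_models: bool = True
    # Devices already determined to be co-located (detection is out of scope).
    co_located: List["Device"] = field(default_factory=list)


def handle_utterance(
    device: Device,
    features: Sequence[float],
    transcript: str,
    actions: Dict[str, Callable[[], str]],
    threshold: float = 0.5,  # assumed acceptance threshold
) -> Optional[str]:
    """Perform the command only if the device's own user is the likely speaker."""
    own_score = device.model.score(features)

    # Obtain models from co-located devices only when the settings allow it.
    other_scores: List[float] = []
    if device.can_access_other_models:
        other_scores = [d.model.score(features) for d in device.co_located]

    # Assumed decision rule: pass the threshold and beat every other user.
    spoken_by_owner = own_score >= threshold and all(
        own_score > s for s in other_scores
    )
    if not spoken_by_owner:
        return None

    # Stand-in for "analyzing the audio signal to identify a command".
    action = actions.get(transcript)
    return action() if action else None
```

With a threshold-only check, an utterance that resembles the first user moderately well would be accepted even when a co-located user matches it better. Comparing against the second speaker model is what lets the first device decline in that case.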