Title: Speech recognition capability generation and control
Abstract: A system for controlling multiple devices using automatic speech recognition (ASR) even when the devices may not be capable of performing ASR themselves. A device such as a media player, appliance, or the like may be recognized by a network. The configured controls for the device (such as a remote control or other mechanism) are incorporated into a device control registry which catalogs device command controls. Individual ASR grammars are constructed for the devices so speech commands for those devices may be processed by an ASR device. The ASR device may then process those speech commands and convert them into the appropriate inputs for the controlled device. The inputs may then be sent to the controlled device, resulting in ASR control for non-ASR devices.
Publication No.: US9443527(B1) Publication date: 2016.09.13
Application No.: US201314040011 Filing date: 2013.09.27
Applicant: Amazon Technologies, Inc. Inventors: Watanabe Yuzo; Rajasekaram Arushan; Ramachandran Rajiv; Baumback Mark Steven
Classification: G10L21/00; G10L15/26; G10L15/22; G10L15/08; G10L15/30 Main classification: G10L21/00
Agency: Seyfarth Shaw LLP Agents: Barzilay Ilan; Miller Cyrus; Seyfarth Shaw LLP
Main claim: 1. A system for speech control of a non-speech processing device, the system comprising: a first device that is not capable of processing speech commands; a second device that is capable of processing speech commands; and one or more server computers; wherein the first device, the second device, and the one or more server computers are configured to perform operations comprising:
establishing a communication link between the first device and the second device;
obtaining, by the second device, information about an identity of the first device;
transmitting, by the second device, the information about the identity of the first device to the one or more server computers;
obtaining, by the one or more server computers, using the information about the identity of the first device, information about a plurality of instructions executable by the first device, the plurality of instructions including a first instruction and a second instruction;
determining, by the one or more server computers, a first speech command corresponding to text describing the first instruction;
determining, by the one or more server computers, a second speech command corresponding to text describing the second instruction;
obtaining, by the one or more server computers, a speech processing model, wherein the speech processing model is processable to identify the first speech command or second speech command in an audio signal;
receiving, by the second device, speech including a command for the first device;
transmitting, by the second device, an audio signal to the one or more server computers, wherein the audio signal corresponds to the speech;
determining, by the one or more server computers and using the speech processing model, that the audio signal includes the second speech command;
transmitting, by the one or more server computers, an indication of the second instruction to the second device;
transmitting, by the second device, the second instruction to the first device; and
executing, by the first device, the second instruction.
Address: Reno NV US
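
Illustrative sketch (not part of the patent record): the server-side steps of claim 1 (looking up a device's instructions by identity, deriving a speech command from the text describing each instruction, and resolving an utterance to an instruction) might be modeled roughly as follows. Everything named here is hypothetical: `DeviceControlRegistry`, `build_speech_model`, `recognize`, the device identity `tv-model-x`, and the instruction codes are assumptions made for illustration. A plain text transcript stands in for the audio signal, and a dictionary lookup stands in for the speech processing model, so this shows only the mapping logic, not actual speech recognition.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceControlRegistry:
    """Hypothetical registry: device identity -> {instruction text: instruction code}."""
    entries: dict = field(default_factory=dict)

    def register(self, device_id, instructions):
        # Store a copy so later changes by the caller do not alter the registry.
        self.entries[device_id] = dict(instructions)

    def instructions_for(self, device_id):
        return self.entries[device_id]


def normalize(text):
    """Lowercase and collapse whitespace so commands compare consistently."""
    return " ".join(text.lower().split())


def build_speech_model(instructions):
    """Derive one speech command per instruction from its descriptive text.

    The returned dict is a stand-in for an ASR grammar or speech model.
    """
    return {normalize(text): code for text, code in instructions.items()}


def recognize(model, transcript):
    """Return the instruction code whose command occurs in the transcript, else None."""
    padded = f" {normalize(transcript)} "
    # Try longer commands first so "volume up" is preferred over "volume".
    for command in sorted(model, key=len, reverse=True):
        if f" {command} " in padded:  # match whole words only
            return model[command]
    return None


# The "first device" (no speech capability) is registered with two instructions.
registry = DeviceControlRegistry()
registry.register("tv-model-x", {"Power On": "IR_0x01", "Volume Up": "IR_0x02"})

# Server side: build the model from the instructions found via the device identity.
model = build_speech_model(registry.instructions_for("tv-model-x"))

# The "second device" forwards the utterance; the server resolves the instruction.
instruction = recognize(model, "Please turn the volume up")
```

In this sketch `instruction` is the code the second device would relay to the first device for execution. A real system would replace the transcript matching with an ASR engine driven by a grammar or language model built from the same command list.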