Title: Speech-enabled content navigation and control of a distributed multimodal browser
Abstract: Speech-enabled content navigation and control of a distributed multimodal browser is disclosed. The browser provides an execution environment for a multimodal application and includes a graphical user agent (‘GUA’) operating on a multimodal device and a voice user agent (‘VUA’) operating on a voice server. The method includes: transmitting, by the GUA, a link message to the VUA, the link message specifying voice commands that control the browser and an event corresponding to each voice command; receiving, by the GUA, a voice utterance from a user, the voice utterance specifying a particular voice command; transmitting, by the GUA, the voice utterance to the VUA for speech recognition by the VUA; receiving, by the GUA, an event message from the VUA, the event message specifying a particular event corresponding to the particular voice command; and controlling, by the GUA, the browser in dependence upon the particular event.
Publication number: US8862475(B2) Publication date: 2014.10.14
Application number: US200711734445 Filing date: 2007.04.12
Applicant: Nuance Communications, Inc. Inventors: Ativanichayaphong Soonthorn; Cross, Jr. Charles W.; McCobb Gerald M.
Classification: G10L21/00; G10L25/00; G06F15/16; G06F3/00; G06F3/16; G10L15/26 Main classification: G10L21/00
Agency: Wolf, Greenfield & Sacks, P.C. Agent: Wolf, Greenfield & Sacks, P.C.
Independent claim: 1. A computer-implemented method of speech-enabled content navigation and control of a distributed multimodal browser, the distributed multimodal browser providing an execution environment for a multimodal application, the distributed multimodal browser including a graphical user agent and a voice user agent operatively coupled to the graphical user agent, the graphical user agent operating on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the voice user agent operating on a voice server, the method comprising: transmitting, by the graphical user agent, a link message to the voice user agent, the link message containing voice commands that control the distributed multimodal browser including at least one grammar associated with the voice commands, the link message also containing an event corresponding to each voice command, wherein at least one of the voice commands is received by the graphical user agent from a voice markup corresponding to the multimodal application; receiving, by the graphical user agent, a voice utterance from a user, the voice utterance specifying a particular voice command; transmitting, by the graphical user agent, the voice utterance to the voice user agent for speech recognition by the voice user agent; receiving, by the graphical user agent, an event message from the voice user agent, the event message specifying a particular event corresponding to the particular voice command specified by the voice utterance; and controlling, by the graphical user agent, the distributed multimodal browser in dependence upon the particular event.
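Illustrative sketch (not part of the patent record): the message exchange recited in the claim can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation. All class, method, and event names (`LinkMessage`, `EventMessage`, `history.back`, and so on) are hypothetical; the utterance is represented as already-transcribed text, and recognition is reduced to a case-insensitive lookup, whereas a real voice server would decode audio against the grammar carried in the link message.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LinkMessage:
    # Hypothetical shape: voice command phrase -> event raised for that command.
    command_events: Dict[str, str]


@dataclass(frozen=True)
class EventMessage:
    # Hypothetical shape: the single event matched by the recognized command.
    event: str


class VoiceUserAgent:
    """Stand-in for the VUA that would run on the voice server."""

    def __init__(self) -> None:
        self._command_events: Dict[str, str] = {}

    def receive_link(self, link: LinkMessage) -> None:
        # Store the commands in lowercase so matching is case-insensitive.
        self._command_events = {
            phrase.lower(): event for phrase, event in link.command_events.items()
        }

    def recognize(self, utterance: str) -> Optional[EventMessage]:
        # Assumption: exact text matching replaces real speech recognition.
        event = self._command_events.get(utterance.strip().lower())
        return EventMessage(event) if event is not None else None


class GraphicalUserAgent:
    """Stand-in for the GUA that would run on the multimodal device."""

    def __init__(self, vua: VoiceUserAgent) -> None:
        self._vua = vua
        self.handled_events: List[str] = []

    def send_link(self, command_events: Dict[str, str]) -> None:
        # Step 1: transmit the link message (commands plus their events).
        self._vua.receive_link(LinkMessage(dict(command_events)))

    def handle_utterance(self, utterance: str) -> Optional[str]:
        # Steps 2-4: forward the utterance and wait for an event message.
        message = self._vua.recognize(utterance)
        if message is None:
            return None
        # Step 5: control the browser; here we only record the event.
        self.handled_events.append(message.event)
        return message.event


if __name__ == "__main__":
    gua = GraphicalUserAgent(VoiceUserAgent())
    gua.send_link({"go back": "history.back", "scroll down": "scroll.down"})
    print(gua.handle_utterance("Go Back"))
```

In this sketch the two agents call each other directly; in the claimed arrangement they run on separate machines, so each method call would correspond to a network message.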
Address: Burlington MA US