Title: Service Oriented Speech Recognition for In-Vehicle Automated Interaction and In-Vehicle User Interfaces Requiring Minimal Cognitive Driver Processing for Same
Abstract: A system and method for implementing a server-based speech recognition system for multimodal automated interaction in a vehicle includes a vehicle driver receiving audio prompts from an on-board human-to-machine interface and responding with speech to complete tasks such as creating and sending text messages, web browsing, and navigation. A service-oriented architecture is used to call upon specialized speech recognizers in an adaptive fashion. The human-to-machine interface enables completion of a text input task while driving a vehicle in a way that minimizes the frequency of the driver's visual and mechanical interactions with the interface, thereby eliminating unsafe distractions during driving conditions. After the initial prompting, the typing task is followed by a computerized verbalization of the text. Subsequent interface steps can be visual in nature, or involve only sound.
Publication No.: US2016071518(A1) Publication Date: 2016.03.10
Application No.: US201514940525 Filing Date: 2015.11.13
Applicant: Sirius XM Connected Vehicle Services Inc. Inventors: Schalk Thomas Barton; Saenz Leonel; Burch Barry
Classification: G10L15/22; G06F17/30; G10L15/30; G06F3/16; G10L15/08 Main Classification: G10L15/22
Agency: Agent:
Main Claim: 1. A method for implementing an interactive automated system, comprising: processing spoken utterances of a person using a processing system located in proximity to the person; transmitting the processed speech information to a remote data center using a wireless link; analyzing the transmitted speech information; based upon an indicated intent of the spoken utterances, selecting at least one optimal speech recognition engine from a set of speech recognition engines; converting the analyzed speech information into packet data format to produce packet speech information; using an internet-protocol transport network, transporting the packet speech information to the selected at least one optimal speech recognition engine and recognizing the converted speech information with the selected at least one optimal speech recognition engine; retrieving recognition results and an associated confidence score from the selected at least one optimal speech recognition engine; if the confidence score meets or exceeds a predetermined threshold for a best match, processing the recognition results to: perform a search; generate search results; transport the search results to the processing system; and present the search results to the person; and if the confidence score is below the predetermined threshold, selecting at least one alternative optimal speech recognition engine to carry out recognition of the converted speech information.
Address: Irving, TX, US
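
The engine-selection logic recited in claim 1 (choose an engine by intent, accept the result if its confidence score meets a threshold, otherwise fall back to an alternative engine) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `Recognition` type, the stub engines and their scores, the `ENGINES_BY_INTENT` table, and the 0.7 default threshold. Real engines would be remote services reached over an IP network; the stubs below simply return fixed results.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Recognition:
    text: str
    confidence: float  # assumed to be normalized to the range 0.0 to 1.0


# An engine takes packetized speech data and returns a recognition result.
Engine = Callable[[bytes], Recognition]


def navigation_engine(packets: bytes) -> Recognition:
    # Hypothetical stub: a specialized engine that is unsure about this input.
    return Recognition("navigate to 5 main street", 0.55)


def dictation_engine(packets: bytes) -> Recognition:
    # Hypothetical stub: an alternative engine with a higher confidence score.
    return Recognition("navigate to 5 main street", 0.91)


# Hypothetical table: engines ranked per intent, preferred engine first.
ENGINES_BY_INTENT: Dict[str, List[Tuple[str, Engine]]] = {
    "navigation": [
        ("navigation", navigation_engine),
        ("general_dictation", dictation_engine),
    ],
}


def recognize_with_fallback(
    packets: bytes,
    intent: str,
    engines_by_intent: Dict[str, List[Tuple[str, Engine]]],
    threshold: float = 0.7,
) -> Optional[Tuple[str, Recognition]]:
    """Try the engines ranked for the intent in order.

    Returns (engine name, result) for the first result whose confidence
    meets or exceeds the threshold, or None if no engine qualifies.
    """
    for name, engine in engines_by_intent.get(intent, []):
        result = engine(packets)
        if result.confidence >= threshold:
            return name, result
    return None


if __name__ == "__main__":
    outcome = recognize_with_fallback(b"\x00\x01", "navigation", ENGINES_BY_INTENT)
    print(outcome)
```

In this sketch the preferred navigation engine scores below the threshold, so the call falls through to the alternative engine, mirroring the final limitation of the claim. The search and presentation steps that follow a successful recognition are omitted.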