Title SYSTEM AND METHOD FOR PROCESSING MULTI-MODAL DEVICE INTERACTIONS IN A NATURAL LANGUAGE VOICE SERVICES ENVIRONMENT
Abstract A system and method for processing multi-modal device interactions in a natural language voice services environment may be provided. In particular, one or more multi-modal device interactions may be received in a natural language voice services environment that includes one or more electronic devices. The multi-modal device interactions may include a non-voice interaction with at least one of the electronic devices or an application associated therewith, and may further include a natural language utterance relating to the non-voice interaction. Context relating to the non-voice interaction and the natural language utterance may be extracted and combined to determine an intent of the multi-modal device interaction, and a request may then be routed to one or more of the electronic devices based on the determined intent of the multi-modal device interaction.
Publication No. US2014249822(A1) Publication Date 2014.09.04
Application No. US201414278645 Filing Date 2014.05.15
Applicant VOICEBOX TECHNOLOGIES CORPORATION Inventors BALDWIN LARRY;WEIDER CHRIS
Classification G10L15/18;G10L15/22 Main Classification G10L15/18
Agency Agent
Main Claim 1. A method for processing one or more multi-modal device interactions in a natural language voice services environment that includes one or more electronic devices, comprising: detecting at least one multi-modal device interaction, wherein the multi-modal device interaction includes a non-voice interaction with at least one of the electronic devices or an application associated with at least one of the electronic devices, and wherein the multi-modal device interaction further includes at least one natural language utterance relating to the non-voice interaction; extracting context information relating to the multi-modal device interaction, wherein the extracted context information includes context relating to the non-voice interaction, and wherein the extracted context information further includes context relating to the natural language utterance; combining the context relating to the non-voice interaction and the context relating to the natural language utterance; determining an intent of the multi-modal device interaction based on the combined context relating to the non-voice interaction and the natural language utterance; and routing at least one request to one or more of the electronic devices based on the determined intent of the multi-modal device interaction.
Address BELLEVUE WA US
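
The five steps recited in claim 1 (detect, extract context, combine, determine intent, route) can be illustrated with a minimal Python sketch. This is not the patented implementation: every class, function, device name, and keyword list below is a hypothetical stand-in chosen for illustration, and simple keyword spotting takes the place of real speech recognition and natural language interpretation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class NonVoiceInteraction:
    device: str   # hypothetical identifier of the device that was touched
    action: str   # e.g. "select"
    target: str   # the item the user selected


@dataclass
class MultiModalInteraction:
    non_voice: NonVoiceInteraction
    utterance: str  # assumed to be already transcribed to text


@dataclass
class Request:
    device: str
    command: str
    argument: str


# Hypothetical mapping from an intent verb to the device that handles it.
ROUTES = {"play": "media_player", "navigate": "navigation_device"}
DEICTIC_WORDS = {"this", "that", "here", "there"}


def extract_non_voice_context(nv: NonVoiceInteraction) -> dict:
    # Context from the non-voice interaction: where it happened, what is in focus.
    return {"source_device": nv.device, "focus": nv.target}


def extract_utterance_context(utterance: str) -> dict:
    # Toy keyword spotting, standing in for a real interpretation engine.
    words = utterance.lower().replace("?", "").replace(".", "").split()
    context = {"deictic": any(w in DEICTIC_WORDS for w in words)}
    for verb in ROUTES:
        if verb in words:
            context["verb"] = verb
            break
    return context


def combine_context(nv_ctx: dict, utt_ctx: dict) -> dict:
    # Merge both sources of context into one view of the interaction.
    return {**nv_ctx, **utt_ctx}


def determine_intent(context: dict) -> Optional[Tuple[str, str]]:
    # A deictic word ("this", "that") is resolved using the non-voice focus.
    verb = context.get("verb")
    if verb is None or not context.get("deictic"):
        return None
    return (verb, context["focus"])


def route(intent: Tuple[str, str]) -> Request:
    verb, argument = intent
    return Request(device=ROUTES[verb], command=verb, argument=argument)


def process(interaction: MultiModalInteraction) -> Optional[Request]:
    nv_ctx = extract_non_voice_context(interaction.non_voice)
    utt_ctx = extract_utterance_context(interaction.utterance)
    intent = determine_intent(combine_context(nv_ctx, utt_ctx))
    return route(intent) if intent is not None else None
```

In this sketch, selecting "Song A" on a touchscreen while saying "Play this" yields a request for the hypothetical `media_player` device, because the word "this" is resolved against the selected item. Neither input alone would be enough to determine the intent.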