Title |
Disambiguation in speech recognition |
Abstract |
Automatic speech recognition (ASR) processing including a feedback configuration to allow for improved disambiguation between ASR hypotheses. After ASR processing of an incoming utterance where the ASR outputs an N-best list including multiple hypotheses, the multiple hypotheses are passed downstream for further processing. The downstream further processing may include natural language understanding (NLU) or other processing to determine a command result for each hypothesis. The command results are compared to determine if any hypotheses of the N-best list would yield similar command results. If so, the hypothesis(es) with similar results are removed from the N-best list so that only one hypothesis of the similar results remains in the N-best list. The remaining non-similar hypotheses are sent for disambiguation, or, if only one hypothesis remains, it is sent for execution. |
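The pruning step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the names `prune_nbest` and `route`, the `similar` predicate, and the representation of command results as plain strings compared for equality are all hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: (hypothesis text, downstream command result).
Scored = Tuple[str, str]


def prune_nbest(
    nbest: List[Scored],
    similar: Callable[[str, str], bool] = lambda a, b: a == b,
) -> List[Scored]:
    """Keep only the highest-ranked hypothesis from each group of similar results."""
    kept: List[Scored] = []
    for hyp, result in nbest:
        # Drop this hypothesis if an already-kept one yields a similar result.
        if not any(similar(result, kept_result) for _, kept_result in kept):
            kept.append((hyp, result))
    return kept


def route(pruned: List[Scored]) -> str:
    """Execute a lone surviving hypothesis; otherwise ask for disambiguation."""
    return "execute" if len(pruned) == 1 else "disambiguate"


# Made-up N-best list, ordered best first (assumed example data).
nbest = [
    ("play the beatles", "PlayMusic(artist=The Beatles)"),
    ("play beatles", "PlayMusic(artist=The Beatles)"),
    ("play the beetles", "PlayMusic(artist=The Beetles)"),
]
pruned = prune_nbest(nbest)
```

In this made-up input, the first two hypotheses produce the same command result, so only the higher-ranked one survives and two hypotheses remain to be disambiguated.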
Publication number |
US9558740(B1) |
Publication date |
2017.01.31 |
Application number |
US201514673343 |
Filing date |
2015.03.30 |
Applicant |
Amazon Technologies, Inc. |
Inventors |
Mairesse Francois;Raccuglia Paul Frederick;Vitaladevuni Shiv Naga Prasad;Reavely Simon Peter |
Classification |
G10L15/04;G10L17/00;G10L15/00;G10L15/08;G10L15/22 |
Main classification |
G10L15/04 |
Agency |
Seyfarth Shaw LLP |
Agent |
Seyfarth Shaw LLP;Barzilay Ilan N.;Cartwright Tyrus S. |
Principal claim |
1. A method for performing automatic speech recognition (ASR) processing on an utterance including a search request, the method comprising:
receiving, from a mobile device, audio data corresponding to an utterance, the utterance comprising a search request;
performing ASR processing on the audio data to determine ASR results, the ASR results comprising a first ASR hypothesis, a second ASR hypothesis and a third ASR hypothesis;
determining a disambiguation group, wherein the disambiguation group comprises the first ASR hypothesis, the second ASR hypothesis and the third ASR hypothesis;
processing, by a search engine:
the first ASR hypothesis to determine first search results comprising a first plurality of entities,
the second ASR hypothesis to determine second search results comprising a second plurality of entities, and
the third ASR hypothesis to determine third search results comprising a third plurality of entities;
determining that the second ASR hypothesis is similar to the third ASR hypothesis as a result of overlap between the second plurality of entities and the third plurality of entities;
determining a revised disambiguation group, wherein the revised disambiguation group comprises the first ASR hypothesis and the second ASR hypothesis; and
sending, to the mobile device, data corresponding to the revised disambiguation group for disambiguation. |
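The claim does not state how overlap between entity sets is measured. As one possible reading, the sketch below assumes Jaccard similarity over sets of entity identifiers with an arbitrary threshold of 0.5; the function names, the threshold, and the entity identifiers are all hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of entities common to both sets (assumed overlap measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def revise_group(results: Dict[str, Set[str]], threshold: float = 0.5) -> List[str]:
    """Return hypotheses whose entity sets do not overlap an earlier kept one."""
    revised: List[str] = []
    for hyp, entities in results.items():
        # Keep the hypothesis only if it is dissimilar to every kept hypothesis.
        if all(jaccard(entities, results[k]) < threshold for k in revised):
            revised.append(hyp)
    return revised


# Made-up search results keyed by hypothesis, in ranked order.
results = {
    "first": {"e1", "e2", "e3"},
    "second": {"e4", "e5", "e6"},
    "third": {"e4", "e5", "e6", "e7"},
}
revised = revise_group(results)
```

With these made-up sets, the second and third hypotheses share three of four distinct entities, so the third is dropped and the revised group holds the first and second hypotheses, mirroring the outcome recited in the claim.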
Address |
Seattle WA US |