Title: Speech recognition repair using contextual information
Abstract: A speech control system that can recognize a spoken command and associated words (such as “call mom at home”) and can cause a selected application (such as a telephone dialer) to execute the command to cause a data processing system, such as a smartphone, to perform an operation based on the command (such as look up mom's phone number at home and dial it to establish a telephone call). The speech control system can use a set of interpreters to repair recognized text from a speech recognition system, and results from the set can be merged into a final repaired transcription which is provided to the selected application.
Publication number: US8762156(B2) Publication date: 2014.06.24
Application number: US201113247912 Filing date: 2011.09.28
Applicant: Apple Inc. Inventor: Chen Lik Harry
Classification: G06F17/27;G06F17/21;G10L15/26;G10L15/06;G10L17/00;G10L15/04;G10L13/00;G10L21/00 Main classification: G06F17/27
Agency: Morrison & Foerster LLP Agent: Morrison & Foerster LLP
Main claim: 1. A machine implemented method comprising: receiving a speech input from a user of a data processing system; determining a context, of the data processing system, when the speech input was received; recognizing text in the speech input through a speech recognition system that includes an acoustic model and a language model, the recognizing of text producing a first text output; storing the first text output as a parsed data structure having a plurality of tokens each of which represents a word in the first text output; processing each of the tokens with a set of interpreters, each interpreter in the set being designed to search one or more databases to search for matches between one or more items in the databases and each of the tokens, each of the interpreters determining from any matches and from the context whether it can repair a token in the first text output, wherein each interpreter is designed to repair an error of a specific type in the first text output; merging selected results from the set of interpreters to produce a final interpreted speech transcription which represents a repaired version of the first text output; providing the final interpreted speech transcription to a selected application, in a set of applications, based on a command in the final interpreted speech transcription, the selected application to execute the command in the final interpreted speech transcription.
Address: Cupertino, CA, US
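
Illustrative sketch (not part of the patent record): the steps of claim 1 after speech recognition (tokenize the first text output, run each token through a set of interpreters, merge the selected repairs, and hand the result to an application chosen by its command) might look roughly like the Python below. Everything named here is an assumption made for illustration: the Token and Repair types, the make_interpreter helper, the has_contacts context key, the fuzzy matching with difflib, the "first repair per token wins" merge rule, and the sample databases. The claim does not specify any of these, and the speech recognizer itself (acoustic and language models) is left out.

```python
from dataclasses import dataclass
from difflib import get_close_matches


@dataclass
class Token:
    index: int  # position of the word in the first text output
    word: str


@dataclass
class Repair:
    index: int   # which token is being repaired
    word: str    # proposed replacement
    source: str  # name of the interpreter that proposed it


def tokenize(text):
    """Store the recognized text as a list of tokens, one per word."""
    return [Token(i, w) for i, w in enumerate(text.split())]


def make_interpreter(name, database, is_relevant):
    """Hypothetical helper: build an interpreter for one type of error.

    The interpreter searches `database` for a near match to the token and
    uses `is_relevant(context)` to decide whether it should act at all.
    """
    def interpret(token, context):
        if not is_relevant(context) or token.word in database:
            return None
        # Assumption: fuzzy string matching stands in for the database search.
        matches = get_close_matches(token.word, database, n=1, cutoff=0.6)
        return Repair(token.index, matches[0], name) if matches else None
    return interpret


def merge(tokens, repairs):
    """Merge selected repairs; here the first repair per token wins."""
    chosen = {}
    for repair in repairs:
        chosen.setdefault(repair.index, repair.word)
    return " ".join(chosen.get(t.index, t.word) for t in tokens)


def repair_and_dispatch(first_text, context, interpreters, applications):
    """Repair the recognized text, then pass it to the matching application."""
    tokens = tokenize(first_text)
    repairs = []
    for token in tokens:
        for interpret in interpreters:
            repair = interpret(token, context)
            if repair is not None:
                repairs.append(repair)
    final_text = merge(tokens, repairs)
    words = final_text.split()
    handler = applications.get(words[0]) if words else None
    return final_text, (handler(final_text) if handler else None)


def dial(text):
    """Stand-in for a telephone dialer application."""
    return "dialing " + " ".join(text.split()[1:])


# Sample interpreters and applications (all hypothetical).
INTERPRETERS = [
    make_interpreter("command", ["call", "text", "play"], lambda ctx: True),
    make_interpreter("contact", ["mom", "dad"],
                     lambda ctx: ctx.get("has_contacts", False)),
]
APPLICATIONS = {"call": dial}

final_text, result = repair_and_dispatch(
    "cal mon at home", {"has_contacts": True}, INTERPRETERS, APPLICATIONS
)
print(final_text)  # call mom at home
print(result)      # dialing mom at home
```

In this sketch each interpreter targets a single kind of error (a misrecognized command word or a misrecognized contact name), which loosely mirrors the claim's requirement that each interpreter repairs an error of a specific type. With an empty context the contact interpreter declines to act, so "mon" would be left unrepaired.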