Abstract |
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for classifying utterances. The methods, systems, and apparatus include actions of obtaining an audio input signal representing an utterance of a user. Additional actions may include determining that a shape of at least a portion of the audio input signal matches a shape of at least a portion of an audio trigger signal corresponding to a keyword. Further actions may include, based at least on that determination, classifying the utterance as a trigger utterance that corresponds to the keyword. |
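
The abstract does not say how the "shape" of two signals is compared. As a rough illustration only, the sketch below assumes that shape matching can be approximated by normalized cross-correlation: each window of the input is mean-centered and scaled to unit energy, so that amplitude differences are ignored and only the waveform's outline is compared against the trigger signal. The function names (`best_shape_match`, `classify_utterance`) and the `threshold` value of 0.9 are hypothetical and do not come from the source.

```python
import math
from typing import Sequence


def _normalize(window: Sequence[float]) -> list[float]:
    """Remove the mean and scale to unit energy so only the shape remains."""
    mean = sum(window) / len(window)
    centered = [x - mean for x in window]
    norm = math.sqrt(sum(x * x for x in centered))
    if norm == 0.0:
        return [0.0] * len(window)  # a flat window has no shape to compare
    return [x / norm for x in centered]


def best_shape_match(signal: Sequence[float], trigger: Sequence[float]) -> float:
    """Highest normalized correlation between the trigger and any equally
    long portion of the signal. Values lie between -1.0 and 1.0."""
    n = len(trigger)
    if n == 0 or len(signal) < n:
        return 0.0
    ref = _normalize(trigger)
    best = -1.0
    # Slide the trigger over every portion of the input signal.
    for start in range(len(signal) - n + 1):
        win = _normalize(signal[start:start + n])
        score = sum(a * b for a, b in zip(win, ref))
        best = max(best, score)
    return best


def classify_utterance(
    signal: Sequence[float],
    trigger: Sequence[float],
    threshold: float = 0.9,  # assumed value, not taken from the source
) -> bool:
    """Classify the utterance as a trigger utterance when some portion of
    the input matches the trigger's shape closely enough."""
    return best_shape_match(signal, trigger) >= threshold
```

Under these assumptions, a quieter copy of the trigger embedded in silence is still classified as a trigger utterance, because normalization discards overall amplitude. A practical system would more likely compare spectral or feature-level envelopes than raw samples.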