Abstract |
<p>A mass-scale, user-independent, device-independent voice messaging system for converting an unstructured audio message from a caller into text for display on a screen. The voice messaging system comprises an automatic speech recognition system configured to generate raw text from an input speech signal. The voice messaging system also comprises a computer-implemented pre-processing front-end subsystem configured to determine an appropriate strategy for converting the audio message by: classifying the message type of the audio message using a plurality of detectors which analyse the audio message; and, based on the classified message type, either routing or not routing the audio message for conversion by the automatic speech recognition system.</p> |
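
The front-end routing described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the message-type labels (`hangup`, `no_speech`, `speech`), the two detectors, their thresholds, the `AudioMessage` fields and the `fake_asr` stand-in are all hypothetical, chosen only to show detectors classifying a message and the classification deciding whether the automatic speech recognition system is invoked.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class AudioMessage:
    """Hypothetical summary features of one voice message."""
    duration_s: float    # total length of the recording in seconds
    speech_ratio: float  # assumed fraction of frames judged to contain speech


# A detector inspects a message and returns a message-type label,
# or None when it has nothing to report.
Detector = Callable[[AudioMessage], Optional[str]]


def hangup_detector(msg: AudioMessage) -> Optional[str]:
    # Assumption: very short recordings are treated as hang-ups.
    return "hangup" if msg.duration_s < 2.0 else None


def no_speech_detector(msg: AudioMessage) -> Optional[str]:
    # Assumption: almost no speech frames means there is nothing to transcribe.
    return "no_speech" if msg.speech_ratio < 0.1 else None


DETECTORS: Sequence[Detector] = (hangup_detector, no_speech_detector)


def classify(msg: AudioMessage, detectors: Sequence[Detector] = DETECTORS) -> str:
    """Return the label of the first detector that fires, else 'speech'."""
    for detector in detectors:
        label = detector(msg)
        if label is not None:
            return label
    return "speech"


def fake_asr(msg: AudioMessage) -> str:
    # Stand-in for the automatic speech recognition system.
    return "<raw text from ASR>"


def convert(msg: AudioMessage, asr: Callable[[AudioMessage], str] = fake_asr) -> str:
    """Route the message to ASR only when it is classified as speech."""
    message_type = classify(msg)
    if message_type == "speech":
        return asr(msg)
    # Not routed to ASR: return a placeholder naming the message type.
    return f"[{message_type}]"
```

In this sketch an ordinary message such as `AudioMessage(duration_s=30.0, speech_ratio=0.8)` is passed to the recogniser, while a one-second recording or a near-silent one bypasses it and yields a bracketed placeholder instead. Running the detectors in a fixed order and taking the first match is one simple design choice; the abstract itself does not say how the detector outputs are combined.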