Title: Cost efficient distributed text-to-speech processing
Abstract: Text-to-speech (TTS) processing systems may be divided among remote TTS servers that are accessible to local user devices through a network connection. The cost of processing on these servers may vary with time. To improve the efficiency of TTS processing, certain requests may be scheduled during low-cost server times. A user may indicate a preference for such low-cost delivery. A user may instead indicate a preference for quick turnaround, which permits TTS processing to be scheduled during higher-cost server times. A TTS processing system may also consider the quality of TTS results when scheduling server processing time for a particular TTS request, and may allocate more server time when higher-quality results are desired.
Publication number: US9311912(B1)  Publication date: 2016.04.12
Application number: US201313947354  Filing date: 2013.07.22
Applicant: Amazon Technologies, Inc.  Inventors: Swietlinski Krzysztof Franciszek; Kaszczuk Michal Tadeusz
Classification: G10L13/00; G10L13/04; G10L13/02  Main classification: G10L13/00
Agency: Seyfarth Shaw LLP  Agents: Seyfarth Shaw LLP; Barzilay Ilan N.; Kakarla Vamsi K.
Principal claim: 1. A method for performing text-to-speech (TTS) processing, comprising: receiving, at a server, a TTS request for TTS processing of text data into speech, wherein the TTS request is sent by a local device remote from the server and includes text data originating from the local device; receiving a user preference for TTS processing performance factors, the TTS processing performance factors including at least one of a cost of TTS processing, a quality of TTS processing or a length of time until delivery of TTS results; determining a plurality of processing options for completion of the TTS request based at least in part on the user preference, wherein the plurality of processing options vary over at least one of cost, quality and delivery time; providing the plurality of processing options to the local device; receiving a user selection of a processing option from the plurality of processing options; scheduling TTS resources for processing the TTS request based at least in part on the user selection; synthesizing the text data into speech based at least in part on the TTS resources; and providing audio data to the local device, the audio data including the synthesized speech.
Address: Reno, NV, US
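
Illustrative sketch (not part of the patent record): the option-determination and scheduling steps recited in claim 1 might be modeled roughly as follows. Everything here is an assumption made for illustration: the hourly rate table, the server-minutes-per-quality figures, the option names ("express", "economy", "premium"), and the functions `cheapest_start`, `determine_options`, and `schedule` are hypothetical and do not come from the patent. Actual speech synthesis is omitted.

```python
from dataclasses import dataclass

# Assumed price table: cost units per server-minute for each hour of the day (0-23).
HOURLY_RATE = [1, 1, 1, 1, 1, 1, 2, 3, 4, 4, 4, 4,
               4, 4, 4, 4, 4, 4, 3, 3, 2, 2, 1, 1]

# Assumed server-minutes needed per 1000 characters at each quality level.
MINUTES_PER_KCHAR = {"standard": 1.0, "high": 3.0}


@dataclass(frozen=True)
class Option:
    name: str
    quality: str
    start_hour: int   # hour of day at which processing would begin
    delay_hours: int  # wait before processing starts
    cost: float


def cheapest_start(now_hour):
    """Return (start_hour, delay_hours) for the cheapest hour in the next 24 h.

    Ties on rate are broken in favor of the shortest delay.
    """
    delay = min(range(24), key=lambda d: (HOURLY_RATE[(now_hour + d) % 24], d))
    return (now_hour + delay) % 24, delay


def build_option(name, quality, start_hour, delay, n_chars):
    minutes = MINUTES_PER_KCHAR[quality] * n_chars / 1000
    return Option(name, quality, start_hour, delay, minutes * HOURLY_RATE[start_hour])


def determine_options(n_chars, now_hour, preference):
    """Build options that vary over cost, quality, and delivery time,
    ordered by the user's stated preference ("cost", "delivery", or "quality")."""
    cheap_hour, cheap_delay = cheapest_start(now_hour)
    options = [
        build_option("express", "standard", now_hour, 0, n_chars),
        build_option("economy", "standard", cheap_hour, cheap_delay, n_chars),
        build_option("premium", "high", cheap_hour, cheap_delay, n_chars),
    ]
    sort_keys = {
        "cost": lambda o: (o.cost, o.delay_hours),
        "delivery": lambda o: (o.delay_hours, o.cost),
        "quality": lambda o: (o.quality != "high", o.cost),
    }
    return sorted(options, key=sort_keys[preference])


def schedule(option, request_id):
    """Record a reservation for the selected option (stand-in for real scheduling)."""
    return {"request": request_id,
            "start_hour": option.start_hour,
            "quality": option.quality}


# A 2000-character request arriving at 10:00 from a cost-sensitive user.
options = determine_options(n_chars=2000, now_hour=10, preference="cost")
selected = options[0]  # stands in for the user's selection from the offered options
job = schedule(selected, request_id="req-1")
```

Under these assumed rates, the options are offered in the order economy, premium, express, and the economy job is deferred to the 22:00 low-rate window. A delivery-first preference would instead list the express option first, at a higher cost.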