Title: Language-independent, non-semantic speech analytics
Abstract: A method for language-independent, non-semantic speech analytics that may analyze spoken utterances without regard for the language or speakers, comprising the steps of receiving an audio input containing human speech, analyzing the audio to identify the waveform pattern, and analyzing the waveform to identify periods of silence, and additional methods for alternative non-speech-based speech analysis, and a system for non-speech-based analysis comprising a media server that receives audio input, an analytics server that processes the audio input, and a management server that configures operation of the analytics server.
Publication number: US9230542(B2) Publication date: 2016.01.05
Application number: US201414231740 Filing date: 2014.04.01
Applicant: ZOOM INTERNATIONAL S.R.O. Inventor: Velasco Moses
Classification: G10L15/00; G10L15/18 Main classification: G10L15/00
Agency: Galvin Patent Law Agent: Galvin Patent Law; Galvin Brian R.
Main claim: 1. A method for language-independent, non-semantic speech analytics, comprising the steps:
receiving, at a media server stored and operating on a network-connected analytics server computer, an audio input from a plurality of network-connected devices;
analyzing, using the analytics server computer, the audio input to determine an audio waveform;
analyzing, using the analytics server computer, the waveform to determine a plurality of periods of silence wherein the plurality of periods of silence are detected by a plurality of valleys in the amplitude of the waveform;
analyzing, using the analytics server computer, the waveform to identify a plurality of units of speech wherein the plurality of units of speech are identified by a plurality of peaks in the amplitude of the waveform;
analyzing, using the analytics server computer, the units of speech within the waveform to determine speech characteristics, including at least a pace of speech during an interaction and a change in pace of speech during an interaction wherein the change in pace is identified by successive stages of analysis utilizing results of previous stages;
analyzing, using the analytics server computer, the waveform to determine a plurality of periods of cross-talk wherein two or more interaction participants are speaking simultaneously wherein a talk ratio is calculated to determine at least a contribution of each of the two or more interaction participants and a quantity of cross talk in the waveform wherein the contribution is computed by determining the relative speaking time of each of the two or more speakers as a fraction of total interaction time;
analyzing, using the analytics server, the waveform to determine an emotional state of a speaker wherein the emotional state of the speaker is determined by the quantity of cross talk in the waveform;
analyzing, using the analytics server, a speech pattern using at least a pace of speech;
identifying an unknown speaker based on the speech pattern wherein identifying the unknown speaker is determined by comparing the speech pattern to a plurality of previously stored speech patterns;
storing the results of waveform analysis for future reference in a database stored and operating on a network-attached computer; and
sending the results of the waveform analysis to a client computing device for viewing by a user.
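The silence, speech-unit, and pace steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the fixed amplitude threshold, the non-overlapping frames, the windowed pace measure, and every function and parameter name below (`frame_envelope`, `segment`, `pace_per_window`, `pace_change`, `threshold`, `window_sec`) are assumptions chosen for illustration. Frames whose mean absolute amplitude falls below the threshold are treated as valleys (silence), the remaining runs as peaks (units of speech), and the change in pace is derived from the per-window pace values of the previous stage.

```python
import math

def frame_envelope(samples, frame_len):
    """Mean absolute amplitude of each non-overlapping frame."""
    env = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        env.append(sum(abs(s) for s in frame) / frame_len)
    return env

def segment(envelope, threshold):
    """Split the envelope into runs of (is_speech, start_frame, end_frame).

    Runs below the threshold model amplitude valleys (silence); runs at or
    above it model amplitude peaks (units of speech). end_frame is exclusive.
    """
    runs = []
    start = 0
    for i in range(1, len(envelope) + 1):
        at_end = i == len(envelope)
        if at_end or (envelope[i] >= threshold) != (envelope[start] >= threshold):
            runs.append((envelope[start] >= threshold, start, i))
            start = i
    return runs

def pace_per_window(runs, frame_sec, window_sec):
    """Units of speech per second, counted by the window each unit starts in."""
    if not runs:
        return []
    total_sec = runs[-1][2] * frame_sec
    n_windows = max(1, math.ceil(total_sec / window_sec))
    counts = [0] * n_windows
    for is_speech, start, _end in runs:
        if is_speech:
            idx = min(int(start * frame_sec / window_sec), n_windows - 1)
            counts[idx] += 1
    return [c / window_sec for c in counts]

def pace_change(paces):
    """Second stage: difference in pace between successive windows."""
    return [later - earlier for earlier, later in zip(paces, paces[1:])]

# Hypothetical input: 16 samples, 2 samples per frame, 0.5 s per frame.
samples = [0, 0, 5, 5, 5, 5, 0, 0, 6, 6, 0, 0, 7, 7, 0, 0]
runs = segment(frame_envelope(samples, frame_len=2), threshold=1)
paces = pace_per_window(runs, frame_sec=0.5, window_sec=2.0)
changes = pace_change(paces)
```

A real system would more likely use a smoothed energy envelope and an adaptive threshold; the fixed threshold here only keeps the example short.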
Address: Prague CZ
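The talk-ratio and cross-talk step of the main claim can be sketched in a similar way. This is an illustrative sketch only: it assumes each participant's speech has already been reduced to a list of (start, end) intervals in seconds, that one participant's own intervals do not overlap each other, and the names `speaking_time`, `talk_ratio`, and `cross_talk_time` are hypothetical. Following the claim wording, each contribution is the participant's speaking time as a fraction of total interaction time, and cross talk is the time during which two or more participants speak simultaneously.

```python
def speaking_time(intervals):
    """Total seconds covered by one speaker's (start, end) intervals."""
    return sum(end - start for start, end in intervals)

def talk_ratio(speakers, total_time):
    """Each speaker's speaking time as a fraction of total interaction time."""
    return {name: speaking_time(iv) / total_time for name, iv in speakers.items()}

def cross_talk_time(speakers):
    """Seconds during which two or more speakers are active at once."""
    events = []
    for intervals in speakers.values():
        for start, end in intervals:
            events.append((start, 1))   # a speaker starts talking
            events.append((end, -1))    # a speaker stops talking
    # At equal timestamps, -1 sorts before +1, so intervals that merely
    # touch end-to-start are not counted as overlapping.
    events.sort()
    active = 0
    overlap = 0.0
    prev = 0.0
    for t, delta in events:
        if active >= 2:
            overlap += t - prev
        active += delta
        prev = t
    return overlap

# Hypothetical two-party interaction lasting 10 seconds.
speakers = {
    "agent": [(0, 4), (6, 8)],
    "customer": [(3, 6), (7, 10)],
}
ratios = talk_ratio(speakers, total_time=10)
overlap = cross_talk_time(speakers)
```

Because speakers can overlap, the contributions need not sum to 1. The claim ties the speaker's emotional state to the quantity of cross talk but gives no mapping, so none is sketched here.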