Title of invention |
Feature extraction for anonymized speech recognition |
Abstract |
Various of the disclosed embodiments relate to systems and methods for extracting audio information, e.g., a textual description of speech, from a speech recording while retaining the anonymity of the speaker. In certain embodiments, a third party may perform various aspects of the anonymization and speech processing. Certain embodiments facilitate anonymization in compliance with various legislative requirements even when third parties are involved. |
Publication number |
US9437207(B2) |
Publication date |
2016.09.06 |
Application number |
US201313856365 |
Filing date |
2013.04.03 |
Applicant |
PULLSTRING, INC. |
Inventors |
Jacob Oren M;Reddy Martin;Langner Brian |
Classification numbers |
G10L21/013;G10L21/00;G10L21/003;G10L15/26;G10L15/30;H04M1/64;G10L15/02 |
Main classification number |
G10L21/013 |
Agency |
Perkins Coie LLP |
Agent |
Perkins Coie LLP |
Independent claim |
1. A computer-implemented method comprising:
receiving a raw audio waveform from a user device, where the raw audio waveform is recorded by the user device and contains speech data of a user;
providing metadata and the raw audio waveform to a third-party processing system, where the metadata identifies one or more frequency components to be redacted from a frequency representation of the raw audio waveform;
directing the third-party processing system to perform an operation on the raw audio waveform, wherein the operation includes:
  generating the frequency representation of the raw audio waveform,
  removing the one or more frequency components from the frequency representation that identify the user to produce a modified frequency representation, and
  generating an anonymized audio waveform from the modified frequency representation,
where the third-party processing system is only allowed to perform speech processing on anonymized portions of the raw audio waveform;
allowing the third-party processing system to copy elements of the anonymized audio waveform to improve a model employed by the third-party processing system to identify speech within audio waveforms;
receiving a textual depiction of speech in the anonymized audio waveform from the third-party processing system; and
performing a system operation based on the textual depiction. |
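The three sub-steps of the claimed operation (generate a frequency representation, remove the identified components, regenerate a waveform) can be illustrated with a minimal sketch. The claim does not say which transform is used or how the metadata is encoded, so everything here is an assumption made for illustration: a naive discrete Fourier transform stands in for the frequency representation, the metadata is a dictionary with a hypothetical `redacted_bins` key listing DFT bin indices, and the function names `dft`, `inverse_dft`, and `anonymize` are invented for this example.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a real-valued waveform."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def anonymize(raw_waveform, metadata):
    """Zero the frequency bins named in the metadata and rebuild the waveform.

    `metadata["redacted_bins"]` is a hypothetical encoding of the
    "frequency components to be redacted"; the patent does not define one.
    """
    spectrum = dft(raw_waveform)
    n = len(spectrum)
    for k in metadata["redacted_bins"]:
        if not 0 <= k < n:
            raise ValueError(f"bin {k} is outside the spectrum (0..{n - 1})")
        spectrum[k] = 0
        # Also zero the mirrored bin so the rebuilt waveform stays real-valued.
        spectrum[(n - k) % n] = 0
    return inverse_dft(spectrum)


if __name__ == "__main__":
    n = 16
    # Toy "raw waveform": a component at bin 1 plus a component at bin 3.
    raw = [
        math.sin(2 * math.pi * t / n) + 0.5 * math.sin(2 * math.pi * 3 * t / n)
        for t in range(n)
    ]
    anonymized = anonymize(raw, {"redacted_bins": [3]})
    print([round(x, 3) for x in anonymized])
```

In this toy input the bin-3 component plays the role of the speaker-identifying content; after redaction only the bin-1 sinusoid remains. A practical system would more likely work on short overlapping frames with a fast transform, which the sketch omits for brevity.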
Address |
San Francisco CA US |