Main claim |
1. A method comprising:
capturing sound, at two or more microphones of a device, from an environment;
generating output data based on the sound;
filtering the output data into a plurality of portions, each portion corresponding to one of a plurality of frequency sub-bands; and
processing a first portion of the plurality of portions, the first portion corresponding to a first frequency sub-band, the processing comprising:
identifying first audio data in the first portion associated with a user;
determining a first direction related to the first audio data by triangulating a first position of the user relative to the device based on a first analysis of the output data, the first analysis including determining a first order in which the two or more microphones captured sound associated with the user;
identifying that second audio data is present within the first portion, the second audio data corresponding to a source different from the user;
classifying the second audio data as background noise;
determining that the second audio data is causing a reduction to a speech-to-noise ratio;
determining, using the first portion and a first beamformer, the first beamformer configured to operate on data corresponding to a first frequency sub-band of the plurality of frequency sub-bands, a second direction related to the source by triangulating a second position of the source relative to the device based on a second analysis of the output data, the second analysis including determining a second order in which the two or more microphones captured sound associated with the second audio data, the second direction different than the first direction; and
adding a weighted signal to further output of the two or more microphones to determine:
attenuated third audio data corresponding to the first frequency sub-band and the second direction, and
amplified fourth audio data corresponding to the first frequency sub-band. |
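
The claim does not specify how the arrival order, the directions, or the weighted signal are computed. As an illustration only, the following minimal sketch assumes two microphones, a far-field source, and a single sub-band that has already been filtered out. It estimates the arrival order by cross-correlation, converts the delay to an angle, and attenuates the noise direction by adding a weighted (here, weight -1) time-aligned copy of the second microphone signal. All names (`lag_of_best_match`, `arrival_order`, `direction_degrees`, `null_steer`), the sample rate, the microphone spacing, and the test signals are hypothetical and do not come from the claim; amplification of the user's audio is not shown.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def lag_of_best_match(ref, other, max_lag):
    """Lag in samples at which `other` best matches `ref` (cross-correlation).

    A positive lag means `other` captured the sound after `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(ref)
    for lag in range(-max_lag, max_lag + 1):
        score = sum(ref[i] * other[i + lag]
                    for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_order(lag):
    """Order in which the two microphones captured the sound."""
    return ("mic0", "mic1") if lag >= 0 else ("mic1", "mic0")


def direction_degrees(lag, sample_rate, mic_spacing):
    """Far-field arrival angle relative to broadside of a two-mic array."""
    ratio = SPEED_OF_SOUND * lag / (sample_rate * mic_spacing)
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))


def null_steer(mic0, mic1, noise_lag, weight=-1.0):
    """Add a weighted, time-aligned copy of mic1 to mic0.

    With weight -1 the source whose delay equals `noise_lag` cancels.
    """
    n = len(mic0)
    aligned = [mic1[i + noise_lag] if 0 <= i + noise_lag < n else 0.0
               for i in range(n)]
    return [a + weight * b for a, b in zip(mic0, aligned)]


def place(burst, start, length=32):
    """Hypothetical test signal: a short burst starting at `start`."""
    sig = [0.0] * length
    sig[start:start + len(burst)] = burst
    return sig


burst = [1.0, -0.5, 0.25]
noise0, noise1 = place(burst, 10), place(burst, 13)  # noise hits mic0 first
user0, user1 = place(burst, 20), place(burst, 18)    # user hits mic1 first
mic0 = [a + b for a, b in zip(noise0, user0)]
mic1 = [a + b for a, b in zip(noise1, user1)]

noise_lag = lag_of_best_match(noise0, noise1, max_lag=5)
user_lag = lag_of_best_match(user0, user1, max_lag=5)
noise_angle = direction_degrees(noise_lag, sample_rate=16000, mic_spacing=0.1)
cleaned = null_steer(mic0, mic1, noise_lag)
```

In this sketch the noise burst is cancelled in `cleaned`, while the user's burst survives because its inter-microphone delay differs from the noise delay. A per-sub-band system as recited in the claim would presumably repeat this kind of processing for each frequency sub-band with its own beamformer.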