Independent claim |
1. A method comprising:
accessing, using one or more processors, audio data that represents query sound to be identified;
creating, using the one or more processors, a spectrogram of the audio data, different segments of the spectrogram representing amplitudes at frequencies in different time slices of the query sound;
determining, using the one or more processors, a dominant frequency in a time slice of the query sound based on a segment of the spectrogram, the determining including:
    calculating an aggregate energy value of a candidate frequency based on amplitudes of the candidate frequency and harmonics thereof represented in the segment of the spectrogram; and
    identifying the candidate frequency as the dominant frequency based on the aggregate energy value of the candidate frequency being a largest aggregate energy value among aggregate energy values of frequencies whose amplitudes are represented in the segment of the spectrogram;
creating, using the one or more processors, a query harmonogram of the audio data, different segments of the query harmonogram representing aggregate energy values of dominant frequencies in different time slices of the query sound; and
providing, using the one or more processors, an identifier of the query sound based on a comparison of the query harmonogram to a reference harmonogram mapped to the identifier by a database. |
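By way of non-limiting illustration only, the steps recited in claim 1 might be sketched in Python as follows. The claim does not specify how the spectrogram is computed, how amplitudes are aggregated, or how harmonograms are compared. The sketch therefore assumes non-overlapping frames with a naive DFT, an aggregate energy equal to the sum of squared amplitudes over a fixed number of harmonics, a match score equal to the fraction of time slices with the same dominant frequency bin, and a plain dictionary standing in for the database. All function names and parameters (`spectrogram`, `dominant_frequency`, `harmonogram`, `similarity`, `identify`, `frame_size`, `num_harmonics`) are hypothetical.

```python
import cmath
import math


def spectrogram(samples, frame_size):
    """Return one list of DFT magnitudes per non-overlapping frame (time slice).

    Assumption: naive DFT over frame_size samples, keeping bins 0..frame_size/2 - 1.
    """
    segments = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        magnitudes = []
        for k in range(frame_size // 2):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            magnitudes.append(abs(acc))
        segments.append(magnitudes)
    return segments


def dominant_frequency(segment, num_harmonics=3):
    """Return (bin, aggregate_energy) for the candidate with the largest aggregate energy.

    Assumption: aggregate energy is the sum of squared amplitudes at the candidate
    bin and its integer multiples (harmonics), up to num_harmonics terms.
    """
    best_bin, best_energy = 0, float("-inf")
    for candidate in range(1, len(segment)):  # skip bin 0 (DC)
        energy = 0.0
        for h in range(1, num_harmonics + 1):
            idx = candidate * h
            if idx >= len(segment):
                break
            energy += segment[idx] ** 2
        if energy > best_energy:  # ties keep the lower bin
            best_bin, best_energy = candidate, energy
    return best_bin, best_energy


def harmonogram(spec, num_harmonics=3):
    """One (dominant bin, aggregate energy) pair per time slice of the spectrogram."""
    return [dominant_frequency(segment, num_harmonics) for segment in spec]


def similarity(query, reference):
    """Assumed comparison: fraction of aligned time slices with equal dominant bins."""
    n = min(len(query), len(reference))
    if n == 0:
        return 0.0
    return sum(1 for i in range(n) if query[i][0] == reference[i][0]) / n


def identify(query, database):
    """Return the identifier whose reference harmonogram best matches the query.

    database is a hypothetical dict mapping identifier -> reference harmonogram.
    """
    if not database:
        return None
    return max(database, key=lambda ident: similarity(query, database[ident]))
```

In such a sketch, a query would be processed as `identify(harmonogram(spectrogram(samples, frame_size)), database)`. A practical embodiment would likely use overlapping windowed frames and a comparison that tolerates time offsets, neither of which is modeled here.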