Title of invention: Active speaker location detection
Abstract: Various examples related to determining a location of an active speaker are provided. In one example, image data of a room from an image capture device is received and a three dimensional model is generated. First audio data from a first microphone array at the image capture device is received. Second audio data from a second microphone array laterally spaced from the image capture device is received. Using the three dimensional model, a location of the second microphone array with respect to the image capture device is determined. Using the audio data and the location and angular orientation of the second microphone array, an estimated location of the active speaker is determined. Using the estimated location, a setting for the image capture device is determined and outputted to highlight the active speaker.
Publication number: US9621795(B1) Publication date: 2017.04.11
Application number: US201614991847 Filing date: 2016.01.08
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC Inventors: Whyte Oliver Arthur;Cutler Ross;Bhattacharjee Avronil;Kowdle Adarsh Prakash Murthy;Kirk Adam;Birchfield Stanley T.;Zhang Cha
Classification: H04N7/15;H04N5/232;H04R3/00;H04R29/00;G06T7/00;H04N7/14 Main classification: H04N7/15
Agency: Alleman Hall McCoy Russell & Tuttle LLP Agent: Alleman Hall McCoy Russell & Tuttle LLP
Main claim: 1. A method for determining a location of an active speaker, the method comprising: from an image capture device, receiving image data of a room in which the active speaker and at least one inactive speaker are located; using the image data, generating a three dimensional model of at least a portion of the room; from a first microphone array at the image capture device, receiving first audio data from the room; from a second microphone array that is laterally spaced from the image capture device, receiving second audio data from the room; using the three dimensional model, determining a location of the second microphone array with respect to the image capture device; using at least the first audio data, the second audio data, the location of the second microphone array, and an angular orientation of the second microphone array, determining an estimated location in the three dimensional model of the active speaker; using the estimated location of the active speaker to compute a setting for the image capture device; and outputting the setting to control the image capture device to highlight the active speaker.
Address: Redmond WA US
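
Illustrative sketch (not part of the patent record): the main claim combines audio from two microphone arrays with the second array's location and angular orientation to estimate where the active speaker is, then derives a camera setting from that estimate. The record does not say how this is computed. The Python below is one possible reading under loud assumptions: the room is reduced to a 2D floor plan, each array has already produced a direction-of-arrival (DOA) angle relative to its own heading, and the speaker is taken to be at the intersection of the two bearing rays. All function names, parameters, and the choice of a pan angle as the "setting" are hypothetical.

```python
import math

def bearing_ray(origin, heading_rad, doa_rad):
    """Turn a DOA measured relative to an array's heading into a world-frame ray."""
    angle = heading_rad + doa_rad
    return origin, (math.cos(angle), math.sin(angle))

def intersect_rays(p1, d1, p2, d2, eps=1e-9):
    """Intersect two 2D rays p + t*d (t >= 0); return None if no valid crossing."""
    denom = d1[0] * d2[1] - d1[1] * d2[0]  # 2D cross product d1 x d2
    if abs(denom) < eps:
        return None  # bearings are (nearly) parallel
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (dx * d2[1] - dy * d2[0]) / denom
    t2 = (dx * d1[1] - dy * d1[0]) / denom
    if t1 < 0 or t2 < 0:
        return None  # crossing lies behind one of the arrays
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])

def estimate_speaker(cam_pos, cam_heading, doa1, arr2_pos, arr2_heading, doa2):
    """Assumed localization step: cross the bearings from the two arrays."""
    p1, d1 = bearing_ray(cam_pos, cam_heading, doa1)
    p2, d2 = bearing_ray(arr2_pos, arr2_heading, doa2)
    return intersect_rays(p1, d1, p2, d2)

def pan_setting(cam_pos, cam_heading, speaker_pos):
    """Hypothetical camera setting: pan angle (radians) that centers the speaker."""
    raw = math.atan2(speaker_pos[1] - cam_pos[1],
                     speaker_pos[0] - cam_pos[0]) - cam_heading
    return (raw + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)

# Camera (with the first array) at the origin facing +x; second array 4 m to
# the side, facing +y. Both arrays hear the speaker 45 degrees off their heading.
speaker = estimate_speaker((0.0, 0.0), 0.0, math.pi / 4,
                           (4.0, 0.0), math.pi / 2, math.pi / 4)
pan = pan_setting((0.0, 0.0), 0.0, speaker)
```

In this sketch the second array's position and heading are passed in directly; in the claimed method they would come from the three dimensional model built from the image data. A real system would also need to handle noisy bearings, for example by taking the point closest to both rays instead of an exact intersection.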