Title of Invention: Utterance state detection device and utterance state detection method
Abstract: An utterance state detection device includes a user voice stream data input unit that acquires user voice stream data of a user, a frequency element extraction unit that extracts high-frequency elements by frequency-analyzing the user voice stream data, a fluctuation degree calculation unit that calculates a fluctuation degree of the extracted high-frequency elements for every unit of time, a statistic calculation unit that calculates a statistic for every certain interval based on a plurality of the fluctuation degrees in a certain period of time, and an utterance state detection unit that detects an utterance state of a specified user based on the statistic obtained from the user voice stream data of the specified user.
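To make the pipeline in the abstract concrete, the following is a minimal Python sketch, not the patented implementation: the 20 ms frame size, the 2 kHz cutoff defining the "high frequency elements", the use of frame-to-frame log-power change as the "fluctuation degree", the mean as the per-interval "statistic", and the fixed detection threshold are all assumptions made purely for illustration.

```python
# Illustrative sketch of the abstract's pipeline (assumed parameters throughout).
import numpy as np

def high_frequency_power(frames, sample_rate, cutoff_hz=2000.0):
    """Per-frame power of spectral components above cutoff_hz (assumed cutoff)."""
    spectrum = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    return spectrum[:, freqs >= cutoff_hz].sum(axis=1)

def fluctuation_degrees(hf_power, eps=1e-10):
    """Fluctuation degree per unit time, taken here as the absolute change in log high-band power."""
    log_power = np.log(hf_power + eps)
    return np.abs(np.diff(log_power, prepend=log_power[0]))

def interval_statistics(fluctuations, frames_per_interval):
    """Representative value (here: the mean) of the fluctuation degrees in each interval."""
    n = len(fluctuations) // frames_per_interval * frames_per_interval
    blocks = fluctuations[:n].reshape(-1, frames_per_interval)
    return blocks.mean(axis=1)

def detect_utterance_state(statistics, threshold):
    """Flag intervals whose statistic exceeds an assumed threshold as a changed utterance state."""
    return statistics > threshold

# Usage example with synthetic data: 1 s of noise at 8 kHz, 20 ms frames, 0.5 s intervals.
rate, frame_len = 8000, 160
signal = np.random.randn(rate)
frames = signal[: len(signal) // frame_len * frame_len].reshape(-1, frame_len)
stats = interval_statistics(fluctuation_degrees(high_frequency_power(frames, rate)), 25)
print(detect_utterance_state(stats, threshold=0.5))
```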
Publication Number: US9099088 (B2)    Publication Date: 2015.08.04
Application Number: US201113064871    Filing Date: 2011.04.21
Applicant: FUJITSU LIMITED    Inventors: Washio Nobuyuki; Harada Shouji; Kamano Akira; Matsuo Naoshi
IPC Classification: G10L15/00; G10L17/26; G10L25/48; G10L21/00; G10L13/00; G08B23/00    Primary Classification: G10L15/00
Agent (Firm): Staas & Halsey LLP    Attorney: Staas & Halsey LLP
Main Claim: 1. An utterance emotional state detection device, comprising:
a memory that stores a reply model including statistically processed information relating to a reply of a user in a normal state; and
a processor coupled to the memory, wherein the processor executes a process comprising:
acquiring user voice stream data of a specified user;
extracting high-frequency elements from the user voice stream data by frequency analysis;
first calculating a fluctuation degree of the extracted high-frequency elements for every unit of time;
second calculating a statistic for every certain interval in the user voice stream data based on a plurality of fluctuation degrees in the every certain interval, the statistic being a representative value obtained from the fluctuation degrees in the every certain interval;
determining that an utterance in the user voice stream data is a reply when a time length of the utterance is smaller than a first predetermined threshold; and
determining an utterance emotional state of the specified user based on the statistic and the reply obtained from the user voice stream data of the specified user,
wherein the determining of the utterance emotional state includes determining the utterance emotional state of the specified user in a reply period,
wherein the reply period is determined from a plurality of replies that continuously appear in the user voice stream data,
wherein each reply in the plurality of replies is smaller than the first predetermined threshold, and
wherein each reply of the plurality of replies in the reply period is compared to the reply model stored in the memory.
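The reply-based decision in claim 1 can also be sketched in code. The sketch below is an assumption-laden illustration, not the claimed method: the 1 s reply-length threshold, the requirement of at least three consecutive replies to form a reply period, and the mean/standard-deviation form of the stored "normal state" reply model are all hypothetical choices.

```python
# Illustrative sketch of claim 1's reply handling (all thresholds and the model form are assumed).
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ReplyModel:
    mean: float  # assumed: mean reply statistic of the user in a normal state
    std: float   # assumed: standard deviation of that statistic

def is_reply(utterance: Tuple[float, float], max_reply_sec: float = 1.0) -> bool:
    """Treat an utterance as a reply when its length is below the first threshold (assumed 1 s)."""
    start, end = utterance
    return (end - start) < max_reply_sec

def reply_periods(utterances: List[Tuple[float, float]], min_replies: int = 3):
    """Group runs of consecutive reply utterances into reply periods (as index ranges)."""
    periods, run_start = [], None
    for i, u in enumerate(utterances):
        if is_reply(u):
            run_start = i if run_start is None else run_start
        else:
            if run_start is not None and i - run_start >= min_replies:
                periods.append((run_start, i))
            run_start = None
    if run_start is not None and len(utterances) - run_start >= min_replies:
        periods.append((run_start, len(utterances)))
    return periods

def abnormal_state(reply_stats: List[float], model: ReplyModel, z: float = 2.0) -> bool:
    """Flag the reply period when its replies deviate from the normal-state reply model."""
    return any(abs(s - model.mean) > z * model.std for s in reply_stats)
```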
Address: Kawasaki, JP