Title of invention: VOICE SEARCH DEVICE, VOICE SEARCH METHOD, AND NON-TRANSITORY RECORDING MEDIUM
Abstract: A search word acquiring unit acquires a search word. A converting unit converts the search word into a phoneme sequence. An output probability acquiring unit acquires, for each frame, an output probability of a feature quantity of a target voice signal being output from each phoneme included in the phoneme sequence. A relative calculating unit executes a relative calculation of the output probability acquired from each phoneme by the output probability acquiring unit, based on an output probability acquired from another phoneme included in the phoneme sequence. A zone designating unit successively designates likelihood acquisition zones. A likelihood acquiring unit acquires a likelihood indicating how likely a likelihood acquisition zone designated by the zone designating unit is a zone in which voice corresponding to the search word is spoken. An identifying unit identifies, from the target voice signal, an estimated zone for which the voice corresponding to the search word is estimated to be spoken, based on the likelihood acquired by the likelihood acquiring unit.
Publication number: US2015255060(A1) Publication date: 2015.09.10
Application number: US201514597958 Filing date: 2015.01.15
Applicant: CASIO COMPUTER CO., LTD. Inventor: TOMITA Hiroki
Classification: G10L15/14;G10L15/18;G10L15/02 Main classification: G10L15/14
Agency: Agent:
Main claim: 1. A voice search device comprising: a search word acquirer acquiring a search word; a converter converting the search word acquired by the search word acquirer into a phoneme sequence; an output probability acquirer acquiring, for each frame, an output probability of a feature quantity of a target voice signal being output from each phoneme included in the phoneme sequence; a relative calculator executing a relative calculation of the output probability acquired from each phoneme by the output probability acquirer, based on an output probability acquired from another phoneme included in the phoneme sequence; a zone designator designating a plurality of likelihood acquisition zones in the target voice signal; a likelihood acquirer acquiring a likelihood indicating how likely a likelihood acquisition zone designated by the zone designator is a zone in which voice corresponding to the search word is spoken, based on the output probability after the calculation by the relative calculator; and an identifier identifying from the target voice signal an estimated zone for which the voice corresponding to the search word is estimated to be spoken, based on the likelihood acquired by the likelihood acquirer from each likelihood acquisition zone designated by the zone designator.
Address: Tokyo JP
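
The processing chain recited in the main claim (relative calculation, zone designation, likelihood acquisition, identification) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the per-frame log output probabilities for each phoneme of the search word are already available, takes the "relative calculation" to be a subtraction of the frame's best log probability among the phonemes in the sequence, designates fixed-length zones by sliding a window, and aligns phonemes to frames uniformly inside each zone. The function names `relative_log_probs`, `zone_likelihood`, and `find_estimated_zone` are hypothetical and do not come from the record.

```python
import math


def relative_log_probs(log_probs):
    """Relative calculation (assumed form): for each frame, subtract the
    largest log probability found among the phonemes of the sequence."""
    return [[p - max(frame) for p in frame] for frame in log_probs]


def zone_likelihood(rel, start, length):
    """Likelihood of one zone, assuming the phonemes are spread uniformly
    over the zone's frames (a simplification chosen for this sketch)."""
    n_phonemes = len(rel[0])
    total = 0.0
    for offset in range(length):
        phoneme = offset * n_phonemes // length  # phoneme aligned to this frame
        total += rel[start + offset][phoneme]
    return total


def find_estimated_zone(log_probs, zone_length, shift=1):
    """Designate zones by sliding a window, score each one, and return the
    start frame of the best-scoring zone together with all scores."""
    rel = relative_log_probs(log_probs)
    starts = range(0, len(rel) - zone_length + 1, shift)
    scores = {s: zone_likelihood(rel, s, zone_length) for s in starts}
    best_start = max(scores, key=scores.get)
    return best_start, scores


# Toy data: a two-phoneme search word; each row is one frame and holds the
# log output probability of phoneme 0 and phoneme 1.
hi, lo = math.log(0.9), math.log(0.1)
A, B = [hi, lo], [lo, hi]  # frame favouring phoneme 0 / phoneme 1
frames = [B, B, B, A, A, B, B, A, A, A]

best_start, scores = find_estimated_zone(frames, zone_length=4)
print(best_start)  # prints 3
```

In the toy data only frames 3 to 6 follow the phoneme order of the search word, so the zone starting at frame 3 receives the highest score. A real system would likely replace the uniform alignment with a dynamic-programming alignment and obtain the probabilities from an acoustic model; both are outside the scope of this sketch.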