Title of invention: Speech retrieval method, speech retrieval apparatus, and program for speech retrieval apparatus
Abstract: A method for speech retrieval includes acquiring a keyword designated by a character string, and a phoneme string or a syllable string, detecting one or more coinciding segments by comparing a character string that is a recognition result of word speech recognition, with words as recognition units, performed for speech data to be retrieved, and the character string of the keyword, calculating an evaluation value of each of the one or more segments by using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string that is recognized in each of the detected one or more segments and that is a recognition result of phoneme speech recognition, with phonemes or syllables as recognition units, performed for the speech data, and outputting a segment in which the calculated evaluation value exceeds a predetermined threshold.
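The steps in the abstract (detect coinciding segments from the word-level result, score each segment against the keyword's phoneme string, keep segments above a threshold) can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not specify how the evaluation value is computed, so the sketch assumes a similarity of one minus the normalized edit distance. The function names (`detect_segments`, `evaluate`, `retrieve`) and the `(symbol, start_sec, end_sec)` tuple layout are hypothetical.

```python
from typing import List, Tuple

Word = Tuple[str, float, float]    # (text, start_sec, end_sec), assumed layout
Phone = Tuple[str, float, float]   # (phoneme, start_sec, end_sec), assumed layout
Segment = Tuple[float, float]      # (start_sec, end_sec)


def edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def detect_segments(words: List[Word], keyword_text: str) -> List[Segment]:
    """Segments where the word-level result coincides with the keyword text."""
    return [(s, e) for text, s, e in words if text == keyword_text]


def evaluate(segment: Segment, phones: List[Phone],
             keyword_phones: List[str]) -> float:
    """Assumed evaluation value: 1 - normalized edit distance, in [0, 1]."""
    start, end = segment
    recognized = [p for p, s, e in phones if s >= start and e <= end]
    longest = max(len(recognized), len(keyword_phones), 1)
    return 1.0 - edit_distance(recognized, keyword_phones) / longest


def retrieve(words: List[Word], phones: List[Phone], keyword_text: str,
             keyword_phones: List[str],
             threshold: float) -> List[Tuple[Segment, float]]:
    """Output segments whose evaluation value exceeds the threshold."""
    hits = []
    for seg in detect_segments(words, keyword_text):
        score = evaluate(seg, phones, keyword_phones)
        if score > threshold:
            hits.append((seg, score))
    return hits
```

Under this assumed metric, a segment whose word-level result matches the keyword but whose phoneme-level result differs by one phoneme out of five would score 0.8 and would be dropped at a threshold of 0.9. That is the filtering effect the abstract describes: the phoneme-level pass screens the word-level hits.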
Publication number: US9626957(B2) Publication date: 2017.04.18
Application number: US201615167522 Filing date: 2016.05.27
Applicant: SINOEAST CONCEPT LIMITED Inventors: Kurata Gakuto;Nagano Tohru;Nishimura Masafumi
Classification: G10L15/02;G10L15/187;G10L15/04;G10L25/51;G10L15/08 Main classification: G10L15/02
Agency: Anova Law Group, PLLC Agent: Anova Law Group, PLLC
Principal claim: 1. A speech retrieval apparatus comprising: a segment detection unit configured to detect one or more coinciding segments for speech data by comparing a character string of a recognition result of word speech recognition and a character string of a keyword, the keyword being designated by the character string and a phoneme string or a syllable string stored in a non-transitory computer readable storage medium; an evaluation value calculation unit configured to calculate an evaluation value of each of the one or more coinciding segments using the phoneme string or the syllable string of the keyword to evaluate a phoneme string or a syllable string recognized in each of the one or more coinciding segments and that is a recognition result of phoneme speech recognition, wherein the phoneme string or the syllable string associated with each of the one or more coinciding segments is a phoneme string or a syllable string associated with a segment in which a start and an end of the segment are expanded by a predetermined time; and a segment output unit configured to output a segment in which the calculated evaluation value exceeds a predetermined threshold.
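The claim adds one detail: the phoneme string is taken from a segment whose start and end have been expanded by a predetermined time. The sketch below illustrates one way that expansion might look. It is an assumption-laden illustration rather than the claimed apparatus: clamping to the audio bounds, the overlap rule for selecting phonemes, and the names `expand_segment` and `phones_in` are all hypothetical.

```python
from typing import List, Tuple

Phone = Tuple[str, float, float]   # (phoneme, start_sec, end_sec), assumed layout
Segment = Tuple[float, float]      # (start_sec, end_sec)


def expand_segment(segment: Segment, margin_sec: float,
                   audio_duration_sec: float) -> Segment:
    """Widen a segment by margin_sec on both sides, clamped to the audio."""
    start, end = segment
    return (max(0.0, start - margin_sec),
            min(audio_duration_sec, end + margin_sec))


def phones_in(segment: Segment, phones: List[Phone]) -> List[str]:
    """Phonemes whose time span overlaps the segment (assumed selection rule)."""
    start, end = segment
    return [p for p, s, e in phones if s < end and e > start]


# Usage: the expanded window picks up phonemes just outside the word boundary.
phones = [("a", 0.5, 0.8), ("b", 0.8, 1.2), ("c", 2.0, 2.2), ("d", 2.25, 2.5)]
narrow = (1.0, 2.0)
wide = expand_segment(narrow, margin_sec=0.25, audio_duration_sec=10.0)
print(phones_in(narrow, phones))  # ['b']
print(wide)                       # (0.75, 2.25)
print(phones_in(wide, phones))    # ['a', 'b', 'c']
```

A plausible motivation for the margin is that word-level and phoneme-level recognizers may place boundaries at slightly different times, so a wider window reduces the chance of cutting off phonemes that belong to the keyword. The claim itself does not state this rationale.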
Address: Wanchai HK