Title of invention: Speech recognition apparatus, speech recognition method, and computer-readable recording medium
Abstract: A speech recognition apparatus 20 includes: an identification language model creation unit 21 that selects, from learning texts 27 for various fields for generating language models 26 for the fields, a phrase that includes a word whose appearance frequency satisfies a set condition on a field-by-field basis, and generates an identification language model 25 for identifying the field of speech using the selected phrases; a speech recognition unit 22 that executes speech recognition on the speech using the identification language model 25, and outputs text data and word confidences as a recognition result; and a field determination unit 23 that specifies a field that includes the most words whose confidences are greater than or equal to a set value based on the text data, the word confidences, and the words in the learning texts for the fields, and determines that the specified field is the field of the speech.
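The field determination step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `determine_field`, the `min_confidence` default of 0.5, the sample vocabularies, and the tie-breaking behaviour (the first field listed wins) are all assumptions made for illustration.

```python
def determine_field(recognized, field_vocab, min_confidence=0.5):
    """Pick the field whose word list contains the most confident words.

    recognized:  list of (word, confidence) pairs from the recognizer.
    field_vocab: dict mapping field name -> set of words for that field.
    min_confidence: hypothetical stand-in for the "set value" in the abstract.
    """
    counts = {field: 0 for field in field_vocab}
    for word, confidence in recognized:
        if confidence < min_confidence:
            continue  # low-confidence words are ignored
        for field, vocab in field_vocab.items():
            if word in vocab:
                counts[field] += 1
    # max() returns the first field on a tie; the source does not say
    # how ties are resolved, so this is an assumption.
    best = max(counts, key=counts.get)
    return best, counts


# Illustrative data only.
recognized = [("stock", 0.9), ("dividend", 0.8), ("fever", 0.3), ("the", 0.95)]
field_vocab = {
    "finance": {"stock", "dividend", "bond"},
    "medical": {"fever", "dosage"},
}
field, counts = determine_field(recognized, field_vocab)
print(field, counts)
```

In this example "fever" is recognized with a confidence below the threshold, so it does not count toward the medical field.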
Publication number: US9142211(B2)  Publication date: 2015.09.22
Application number: US201313766247  Filing date: 2013.02.13
Applicant: NEC CORPORATION  Inventor: Sakai Atsunori
Classification: G06F17/27;G10L15/18;G10L15/06  Main classification: G06F17/27
Agency: Sughrue Mion, PLLC  Agent: Sughrue Mion, PLLC
Principal claim: 1. A speech recognition apparatus comprising: an identification language model creation unit that selects, from learning texts for a plurality of fields for generating language models for the fields, a phrase that includes a word whose appearance frequency satisfies a set condition on a field-by-field basis, and generates an identification language model for identifying the field of input speech using the selected phrases; a speech recognition unit that executes speech recognition on the input speech using the identification language model, and outputs text data and a confidence for each word included in the text data as a recognition result; and a field determination unit that specifies a field that includes the most words whose confidences are greater than or equal to a set value based on the text data, the confidences of the words, and the words included in the learning texts for the fields, and determines that the specified field is the field of the input speech, wherein the identification language model creation unit comprises: an appearance frequency list creation unit that, for each field, generates an appearance frequency list in which the words included in the corresponding learning text are arranged based on appearance frequency, and in which a word other than a noun and a word that appears in a plurality of learning texts for different fields have been removed; a text selection unit that, for each field, specifies a word whose appearance frequency satisfies a set condition from the appearance frequency list, and selects a phrase that includes the specified word from the learning text; and a creation processing unit that generates the identification language model using the phrases selected for each field, wherein the field determination unit compares the words included in the text data and the appearance frequency lists for the fields, specifies an appearance frequency list that includes the most words whose confidences are greater than or equal to a set value, and determines that the field of the specified appearance frequency list is the field of the input speech, wherein the speech recognition apparatus further comprises a language model reconstruction unit that acquires the phrases selected by the identification language model creation unit for each field other than the specified field, adds the acquired phrases as a learning text to the language model in the specified field and reconstructs the language model in the specified field, wherein the speech recognition unit again executes speech recognition on the input speech using the language model reconstructed by the language model reconstruction unit.
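The appearance frequency list creation, text selection, and language model reconstruction steps recited in the claim can be sketched in Python as follows. This is an illustrative reading only, under several assumptions not stated in the claim: words are split on whitespace, nouns are identified by a caller-supplied predicate `is_noun` (hypothetical), the "set condition" is taken to be "among the `top_n` most frequent words", and reconstruction is shown only as assembling the enlarged learning text, not as training an actual language model. All function names are hypothetical.

```python
from collections import Counter


def build_frequency_lists(learning_texts, is_noun):
    """For each field, list (word, count) pairs by descending frequency,
    keeping only nouns that appear in exactly one field's learning text.

    learning_texts: dict mapping field name -> list of phrases (strings).
    is_noun: hypothetical predicate; the claim does not say how nouns are found.
    """
    counts = {
        field: Counter(
            w for phrase in phrases for w in phrase.lower().split() if is_noun(w)
        )
        for field, phrases in learning_texts.items()
    }
    # Count in how many fields each word occurs, then drop shared words.
    fields_per_word = Counter(w for c in counts.values() for w in c)
    shared = {w for w, n in fields_per_word.items() if n > 1}
    return {
        field: [(w, n) for w, n in c.most_common() if w not in shared]
        for field, c in counts.items()
    }


def select_phrases(learning_texts, freq_lists, top_n=1):
    """Select phrases containing one of the top_n words of each field's list.
    Using a top_n rank as the "set condition" is an assumption."""
    selected = {}
    for field, phrases in learning_texts.items():
        keywords = {w for w, _ in freq_lists[field][:top_n]}
        selected[field] = [p for p in phrases if keywords & set(p.lower().split())]
    return selected


def reconstruction_text(learning_texts, selected, specified_field):
    """Learning text of the specified field plus the phrases selected
    for every other field, as input for rebuilding its language model."""
    extra = [p for f, ps in selected.items() if f != specified_field for p in ps]
    return learning_texts[specified_field] + extra


# Illustrative data only.
NOUNS = {"stock", "price", "bank", "rate", "dividend",
         "patient", "fever", "cough", "river"}
learning_texts = {
    "finance": ["the stock price rose", "the bank raised the rate", "stock dividend paid"],
    "medical": ["the patient has fever", "fever and cough", "the bank of the river"],
}
freq_lists = build_frequency_lists(learning_texts, NOUNS.__contains__)
selected = select_phrases(learning_texts, freq_lists, top_n=1)
rebuilt = reconstruction_text(learning_texts, selected, "finance")
print(freq_lists["finance"], selected["medical"], len(rebuilt))
```

Here "bank" occurs in both learning texts, so it is removed from both frequency lists and cannot be used to discriminate between the fields.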
Address: Tokyo JP