Title of invention
Abstract PROBLEM TO BE SOLVED: Existing methods for addressing the low language-likelihood accuracy of word strings that have little training data each have a drawback. A method that classifies morpheme strings has the disadvantage that the language constraint is weaker than that of a word N-gram. A method that applies a word N-gram to the low-order hierarchy and a class N-gram to the high-order hierarchy has the disadvantage that linkage statistics between a word in the low-order hierarchy and a word in the high-order hierarchy cannot be estimated reliably, because the word N-gram in the low-order hierarchy is integrated as the class N-gram in the high-order hierarchy. SOLUTION: An N-gram language model creation device, which creates an N-gram language model of morphemes and classes from a corpus, includes a first corpus that is partially grouped by morphemes and classes, a second corpus in which linkage examples of the set of morphemes belonging to a class are described as morpheme strings, and word sequence development means for embedding and expanding the morpheme strings of the second corpus into the class strings of the first corpus. COPYRIGHT: (C)2009,JPO&INPIT
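The expansion step named in the abstract can be illustrated with a minimal sketch. The abstract does not specify data formats or how the expanded corpus is counted, so everything here is an assumption for illustration: the function names `expand_class_corpus` and `count_ngrams`, the `<CITY>` class token, the toy sentences, and the choice to substitute every example string for each class token.

```python
from collections import Counter
from itertools import product


def expand_class_corpus(first_corpus, class_examples):
    """Embed example morpheme strings into class tokens (hypothetical helper).

    first_corpus: list of sentences, each a list of morphemes and class tokens.
    class_examples: maps a class token to a list of morpheme strings
                    (the "second corpus" in the abstract's terms).
    """
    expanded = []
    for sentence in first_corpus:
        # Each position offers one or more alternative morpheme sequences;
        # a plain morpheme offers only itself.
        slots = [class_examples.get(tok, [[tok]]) for tok in sentence]
        for choice in product(*slots):
            expanded.append([m for seq in choice for m in seq])
    return expanded


def count_ngrams(sentences, n=2):
    """Count N-grams with sentence-boundary padding (assumed convention)."""
    counts = Counter()
    for s in sentences:
        padded = ["<s>"] * (n - 1) + s + ["</s>"]
        for i in range(len(padded) - n + 1):
            counts[tuple(padded[i:i + n])] += 1
    return counts


# Toy data, invented for this example.
first_corpus = [["go", "to", "<CITY>"]]
class_examples = {"<CITY>": [["new", "york"], ["kyoto"]]}

expanded = expand_class_corpus(first_corpus, class_examples)
bigrams = count_ngrams(expanded, n=2)
```

After expansion, counts such as `("to", "kyoto")` link a morpheme outside the class directly to a morpheme inside it, which is the kind of cross-hierarchy linkage statistic the abstract says the two-level approach cannot estimate reliably.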
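Publication number JP5137588(B2) Publication date 2013.02.06
Application number JP20080002194 Filing date 2008.01.09
Applicant Inventor
Classification G10L15/187;G10L15/06;G10L15/197 Main classification G10L15/187
Agency Agent
Principal claim
Address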