Title of Invention |
METHOD TO SELECT LEARNING TEXT FOR LANGUAGE MODEL, METHOD TO LEARN LANGUAGE MODEL BY USING THE SAME LEARNING TEXT, AND COMPUTER AND COMPUTER PROGRAM FOR EXECUTING THE METHODS |
Abstract |
PROBLEM TO BE SOLVED: To provide a technique for efficiently collecting, from a corpus outside the target domain, sentences that resemble those contained in the target-domain corpus. SOLUTION: A technique for selecting learning text for a language model comprises: generating a template for selecting the learning text by at least one of (i) a generation technique that replaces one or more words in a word string from the corpus of a first domain with a special symbol representing any word or word string and uses the resulting word string as the template, or (ii) a generation technique that uses a word string from the corpus of the first domain as the template; and selecting, as the learning text, text covered by the template from a corpus of a second domain different from the first domain. SELECTED DRAWING: Figure 2A |
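The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `WILDCARD`, `make_templates`, `covered`, and `select_training_text`, the fixed word-string length `n=3`, and the toy sentences are all hypothetical. As a simplification, the special symbol here stands for exactly one word (the abstract also allows a word string), and a sentence is treated as "covered" if at least one of its word strings matches a template.

```python
from typing import Iterable, List, Set, Tuple

WILDCARD = "*"  # hypothetical special symbol standing for any single word

Template = Tuple[str, ...]


def word_strings(sentence: str, n: int) -> List[Template]:
    """Return all contiguous n-word strings of a whitespace-tokenized sentence."""
    tokens = sentence.split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def variants(gram: Template) -> List[Template]:
    """Return the word string itself plus each version with one word replaced."""
    result = [gram]
    for j in range(len(gram)):
        result.append(gram[:j] + (WILDCARD,) + gram[j + 1:])
    return result


def make_templates(first_domain: Iterable[str], n: int = 3) -> Set[Template]:
    """Build templates from the first-domain corpus using both generation techniques."""
    templates: Set[Template] = set()
    for sentence in first_domain:
        for gram in word_strings(sentence, n):
            templates.update(variants(gram))
    return templates


def covered(sentence: str, templates: Set[Template], n: int = 3) -> bool:
    """Assumed coverage rule: some n-word string of the sentence matches a template."""
    return any(
        candidate in templates
        for gram in word_strings(sentence, n)
        for candidate in variants(gram)
    )


def select_training_text(first_domain: Iterable[str],
                         second_domain: Iterable[str],
                         n: int = 3) -> List[str]:
    """Select second-domain sentences covered by first-domain templates."""
    templates = make_templates(first_domain, n)
    return [s for s in second_domain if covered(s, templates, n)]


# Toy corpora, invented for illustration only.
in_domain = ["please reset my password", "i want to open an account"]
out_domain = [
    "please reset your password now",
    "the weather is nice today",
    "i want to close an account",
]

selected = select_training_text(in_domain, out_domain)
print(selected)
```

In this toy run, the first and third out-of-domain sentences are selected: one matches a template containing the special symbol ("please reset *"), the other matches an unmodified word string ("i want to"). The weather sentence shares no word string with the first domain and is left out.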
Application Publication Number |
JP2016024759(A) |
Application Publication Date |
2016.02.08 |
Application Number |
JP20140150554 |
Filing Date |
2014.07.24 |
Applicant |
INTERNATIONAL BUSINESS MACHINES CORPORATION |
Inventors |
KURATA TAKEHITO;ITO NOBUYASU;NISHIMURA MASAFUMI |
Classification Numbers |
G06F17/27;G10L15/06;G10L15/197 |
Main Classification Number |
G06F17/27 |
Agency |
|
Agent |
|
Main Claim |
|
Address |
|