Title of invention: Morphological analyzer generation device, morphological analyzer generation method, and program
Abstract PROBLEM TO BE SOLVED: To provide a method for generating a morphological analyzer that can also estimate parts of speech through unsupervised learning. SOLUTION: The method includes: a step of generating and storing an NPYLM, which represents the probability that a following substring appears given a certain substring, from two or more sentences stored in a learning data storage unit; and a step of reading a sentence from the learning data storage unit, estimating the most probable word boundaries using a CRF whose feature functions take as arguments a latent variable representing the part of speech of each substring and the occurrence probability of the substring computed by the NPYLM, updating the parameters of the CRF using as teacher data the word boundaries obtained by blocked Gibbs sampling from the end of the sentence toward its beginning, and updating the NPYLM based on those word boundaries, with this processing repeated until a convergence condition is satisfied. A sentence whose word boundaries and CRF parameters have already been updated is learned again only after the substrings that formed its previously obtained word boundaries, together with their connection information, have been removed from the NPYLM.
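The remove, resample, and re-add cycle described in the abstract can be illustrated with a heavily simplified sketch. This is not the patented method: the NPYLM is replaced by a toy Dirichlet-process unigram word model, and the CRF and the latent part-of-speech variable are omitted entirely. The names `UnigramLM`, `sample_segmentation`, and `train`, the constants `MAX_LEN`, `ALPHA`, and `P_CHAR`, and the fixed iteration count used in place of a convergence test are all assumptions made for illustration.

```python
import random
from collections import Counter

MAX_LEN = 4       # assumed maximum word length (hypothetical)
ALPHA = 1.0       # assumed concentration parameter of the toy model
P_CHAR = 1 / 30   # assumed uniform per-character probability


class UnigramLM:
    """Toy stand-in for the NPYLM: a unigram word model with a
    character-level base measure. Not the model named in the abstract."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def add(self, words):
        for w in words:
            self.counts[w] += 1
            self.total += 1

    def remove(self, words):
        for w in words:
            self.counts[w] -= 1
            self.total -= 1
            if self.counts[w] == 0:
                del self.counts[w]

    def prob(self, w):
        base = P_CHAR ** len(w)  # always > 0, so unseen words stay possible
        return (self.counts[w] + ALPHA * base) / (self.total + ALPHA)


def sample_segmentation(sentence, lm, rng):
    """Blocked Gibbs step: forward filtering, then backward sampling
    of word boundaries from the end of the sentence to its beginning."""
    n = len(sentence)
    # alpha[t]: probability of sentence[:t], summed over all segmentations
    alpha = [0.0] * (n + 1)
    alpha[0] = 1.0
    for t in range(1, n + 1):
        for k in range(1, min(MAX_LEN, t) + 1):
            alpha[t] += lm.prob(sentence[t - k:t]) * alpha[t - k]
    words = []
    t = n
    while t > 0:
        ks = list(range(1, min(MAX_LEN, t) + 1))
        weights = [lm.prob(sentence[t - k:t]) * alpha[t - k] for k in ks]
        k = rng.choices(ks, weights=weights)[0]
        words.append(sentence[t - k:t])
        t -= k
    words.reverse()
    return words


def train(sentences, iterations=20, seed=0):
    """Repeat: remove a sentence's previous words from the model,
    resample its boundaries, and add the new words back.
    A fixed iteration count stands in for the convergence condition."""
    rng = random.Random(seed)
    lm = UnigramLM()
    segs = {}
    for _ in range(iterations):
        for i, s in enumerate(sentences):
            if i in segs:
                lm.remove(segs[i])  # drop the previous segmentation first
            segs[i] = sample_segmentation(s, lm, rng)
            lm.add(segs[i])
    return lm, segs
```

Calling `train(["abcabc", "abcab"], iterations=5)` returns the model and a dictionary mapping each sentence index to its sampled word list. Removing a sentence's old words before resampling keeps the model from being biased toward the segmentation it is about to replace, which is the role the abstract assigns to removing the previous substrings and their connection information from the NPYLM.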
Publication number: JP6062816(B2) Publication date: 2017.01.18
Application number: JP20130148399 Filing date: 2013.07.17
Applicant: 株式会社デンソーアイティーラボラトリ Inventor: 内海 慶
Classification: G06F17/27 Main classification: G06F17/27
Agency: Agent:
Main claim:
Address: