Title of invention |
LANGUAGE MODEL WITH STRUCTURED PENALTY |
Abstract |
A penalized loss is optimized using a corpus of language samples respective to a set of parameters of a language model. The penalized loss includes a function measuring predictive accuracy of the language model respective to the corpus of language samples and a penalty comprising a tree-structured norm. The trained language model, with optimized values for the parameters generated by the optimizing, is applied to predict a symbol following a sequence of symbols of the language modeled by the language model. In some embodiments, the penalty comprises a tree-structured ℓ_p-norm, such as a tree-structured ℓ_2-norm or a tree-structured ℓ_∞-norm. In some embodiments, a tree-structured ℓ_∞-norm operates on a collapsed suffix trie in which any series of suffixes of increasing lengths which are always observed in the same context are collapsed into a single node. The optimizing may be performed using a proximal step algorithm. |
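As an illustration of the proximal step mentioned in the abstract, the following is a minimal sketch of a proximal operator for a tree-structured ℓ_2-norm. It is not taken from the application. It assumes one scalar weight per tree node, a uniform penalty strength `lam`, and groups formed by each node together with all of its descendants. For groups nested in this way, the proximal operator can be computed by applying group-wise shrinkage from the leaves up to the root. The names `prox_tree_l2`, `subtree_nodes`, `parent`, and `lam` are hypothetical.

```python
import math


def subtree_nodes(children, root):
    """Return all nodes in the subtree rooted at `root`, ancestors first."""
    stack, out = [root], []
    while stack:
        node = stack.pop()
        out.append(node)
        stack.extend(children[node])
    return out


def prox_tree_l2(w, parent, lam):
    """Proximal step for lam * sum over nodes of ||w[subtree(node)]||_2.

    w      : list of floats, one weight per tree node (an assumption of this sketch)
    parent : parent[i] is the parent index of node i, or -1 for the root
    lam    : non-negative penalty strength (hypothetical parameter name)
    """
    n = len(w)
    children = [[] for _ in range(n)]
    root = None
    for node, p in enumerate(parent):
        if p < 0:
            root = node
        else:
            children[p].append(node)

    w = list(w)  # work on a copy so the input is left untouched
    order = subtree_nodes(children, root)  # every node appears after its ancestors

    # Visit descendants before ancestors and shrink each subtree group in turn.
    for node in reversed(order):
        group = subtree_nodes(children, node)
        norm = math.sqrt(sum(w[i] ** 2 for i in group))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0.0 else 0.0
        for i in group:
            w[i] *= scale
    return w
```

In a proximal gradient loop, a function like this would be applied to the parameters after each gradient step on the predictive-accuracy term, with `lam` scaled by the step size. The quadratic cost of recomputing each subtree here is kept only for readability.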
Publication number |
EP2996045(A1) |
Publication date |
2016.03.16 |
Application number |
EP20150182813 |
Filing date |
2015.08.27 |
Applicant |
XEROX CORPORATION |
Inventors |
NELAKANTI, ANIL KUMAR;BOUCHARD, GUILLAUME M.;ARCHAMBEAU, CEDRIC;BACH, FRANCIS;MAIREL, JULIEN |
Classification number |
G06F17/27 |
Main classification number |
G06F17/27 |
Patent agency |
|
Patent agent |
|
Principal claim |
|
Address |
|