Title of invention |
Bootstrapping named entity canonicalizers from English using alignment models |
Abstract |
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training recognition canonical representations corresponding to named-entity phrases in a second natural language based on translating a set of allowable expressions with canonical representations from a first natural language, which may be generated by expanding a context-free grammar for the allowable expressions for the first natural language. |
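The grammar-expansion idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy, non-recursive grammar with a single rule; the names `GRAMMAR`, `RULE`, and `expand`, the time-of-day vocabulary, and the `HH:MM` canonical format are all hypothetical and do not come from the patent.

```python
from itertools import product

# Hypothetical toy grammar: each nonterminal maps to
# (surface phrase, canonical fragment) pairs.
GRAMMAR = {
    "HOUR": [("one", "01"), ("two", "02")],
    "MINUTE": [("fifteen", "15"), ("thirty", "30")],
}

# Single hypothetical rule: TIME -> HOUR MINUTE, canonical form "HH:MM".
RULE = ("HOUR", "MINUTE")


def expand(grammar, rule, joiner=":"):
    """Enumerate every expression the rule allows, paired with its canonical form."""
    pairs = []
    for choice in product(*(grammar[symbol] for symbol in rule)):
        surface = " ".join(phrase for phrase, _ in choice)
        canonical = joiner.join(fragment for _, fragment in choice)
        pairs.append((surface, canonical))
    return pairs


EXPRESSIONS = expand(GRAMMAR, RULE)
for surface, canonical in EXPRESSIONS:
    print(f"{surface!r} -> {canonical}")
```

A real grammar would likely have recursive rules and many more alternatives; the sketch only shows how expansion yields (expression, canonical representation) pairs for the first natural language.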
Publication number
US9146919(B2) |
Publication date
2015.09.29 |
Application number
US201313830969 |
Filing date
2013.03.14 |
Applicant
Google Inc. |
Inventors
Epstein Mark Edward;Mengibar Pedro J. |
Classification
G06F17/28;G06F17/27 |
Main classification
G06F17/28 |
Agency
Fish & Richardson P.C. |
Agent
Fish & Richardson P.C. |
Independent claim
1. A computer-implemented method, the method comprising:
receiving a set of acceptable expressions, each acceptable expression being a string that identifies a value of a variable entity in a first natural language, each acceptable expression being associated with a canonical representation of the value identified by that expression;
performing, by a first machine translator that translates expressions from the first natural language to a second natural language, machine translation on each acceptable expression in the first natural language to obtain a translated expression of the acceptable expression in the second natural language;
associating the canonical representation associated with each acceptable expression with the corresponding translated expression in the second natural language;
providing a set of training data for training a second machine translator that translates expressions in the second natural language that each include a respective translated expression to expressions in the second natural language that each include a respective canonical representation, the set of training data comprising the translated expressions and the canonical representations that are associated with the translated expressions; and
using the second machine translator to translate a particular expression that includes a particular translated expression into a particular translated expression that includes a particular canonical representation. |
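The steps of the claim can be sketched as a small pipeline. This is only an illustration under strong assumptions: the first machine translator is replaced by a hypothetical lookup table (`TOY_TRANSLATIONS`), the second machine translator is replaced by simple phrase substitution rather than a trained model, and the English and Spanish phrases are made-up examples that do not come from the patent.

```python
# Step 1: acceptable expressions in the first language, each with its
# canonical representation (made-up examples).
ACCEPTABLE = {"one fifteen": "01:15", "two thirty": "02:30"}

# Stand-in for the first machine translator (English -> Spanish).
# A real system would call an actual MT engine; entries are illustrative only.
TOY_TRANSLATIONS = {"one fifteen": "una y cuarto", "two thirty": "dos y media"}


def first_translator(expression):
    """Hypothetical first translator: plain dictionary lookup."""
    return TOY_TRANSLATIONS[expression]


def build_training_data(acceptable):
    """Steps 2-4: translate each expression and carry its canonical value across."""
    return [(first_translator(expr), canonical)
            for expr, canonical in acceptable.items()]


def train_second_translator(training_data):
    """Stand-in for training: memorize phrase -> canonical pairs, longest first."""
    table = sorted(training_data, key=lambda pair: len(pair[0]), reverse=True)

    def canonicalize(sentence):
        # Step 5: rewrite a second-language sentence so that the translated
        # expression is replaced by its canonical representation.
        for phrase, canonical in table:
            if phrase in sentence:
                return sentence.replace(phrase, canonical)
        return sentence

    return canonicalize


TRAINING_DATA = build_training_data(ACCEPTABLE)
canonicalize = train_second_translator(TRAINING_DATA)
RESULT = canonicalize("alarma a las dos y media")
print(RESULT)
```

The point of the design is that no hand-written grammar is needed for the second language: the canonical values are inherited from the first-language expressions through translation, and the second translator learns from those pairs.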
Address
Mountain View CA US |