Title: Multi-source transfer of delexicalized dependency parsers
Abstract: A source-language sentence is tagged with non-lexical tags, such as part-of-speech tags, and is parsed using a lexicalized parser trained in the source language. A target-language sentence that is a translation of the source-language sentence is tagged with non-lexical labels (e.g., part-of-speech tags) and is parsed, using a delexicalized parser that has been trained in the source language, to produce k-best parses. The best parse is selected based on the parse's alignment with the lexicalized parse of the source-language sentence. The selected best parse can be used to update the parameter vector of a lexicalized parser for the target language.
Publication number: US9305544(B1) Publication date: 2016.04.05
Application number: US201514594900 Filing date: 2015.01.12
Applicant: Google Inc. Inventors: Petrov Slav;McDonald Ryan;Hall Keith
Classification: G06F17/20;G06F17/28;G06F17/27;G06F17/21;G10L15/00;G10L17/00;G10L21/00;G10L15/06 Main classification: G06F17/20
Agency: Middleton Reutlinger Agent: Middleton Reutlinger
Main claim: 1. A computer implemented method, comprising: identifying a parse tree for a source-language sentence, the parse tree being generated based on tagging the source-language sentence and parsing the tagged source-language sentence using a source language lexicalized parser that has been trained using source-language Treebank data; tagging a target-language sentence with parts of speech tags, where the target-language sentence is a translation of the source-language sentence; parsing, utilizing one or more processors, the tagged target-language sentence with a delexicalized parser to generate a set of k-best parse trees for the target-language sentence, where the delexicalized parser has been trained using source-language Treebank data; selecting the best target-language parse tree of the k-best parse trees that most closely aligns with the parse tree of the source-language sentence; and updating a parameter vector of a target language lexicalized parser based upon the selected best target-language parse tree.
Address: Mountain View CA US
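
Illustrative sketch (not part of the patent record): the selection and update steps recited in the main claim could look roughly like the Python below. Everything concrete here is an assumption made for illustration: parse trees are represented as head-index lists, the word alignment is a plain target-to-source index mapping, "most closely aligns" is approximated by counting target dependency edges whose aligned counterparts also form an edge in the source parse, and the parameter update is a simple perceptron-style step over hypothetical word-pair and tag-pair edge features. The record does not specify any of these details.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence

# Assumed representation: heads[m] is the index of token m's head; -1 marks the root.
Heads = List[int]


def alignment_score(target_heads: Heads, source_heads: Heads,
                    alignment: Dict[int, int]) -> int:
    """Count target edges (head -> modifier) whose aligned source tokens
    are also linked head -> modifier in the source parse."""
    score = 0
    for m, h in enumerate(target_heads):
        # Skip the root edge and any edge with an unaligned endpoint.
        if h < 0 or m not in alignment or h not in alignment:
            continue
        if source_heads[alignment[m]] == alignment[h]:
            score += 1
    return score


def select_best_parse(k_best: Sequence[Heads], source_heads: Heads,
                      alignment: Dict[int, int]) -> Heads:
    """Pick the k-best parse with the highest alignment score.
    Ties go to the earlier (higher-ranked) parse."""
    return max(k_best, key=lambda t: alignment_score(t, source_heads, alignment))


def edge_features(heads: Heads, words: Sequence[str],
                  tags: Sequence[str]) -> Counter:
    """Hypothetical lexicalized features: one word-pair and one tag-pair per edge."""
    feats: Counter = Counter()
    for m, h in enumerate(heads):
        head_word = words[h] if h >= 0 else "<ROOT>"
        head_tag = tags[h] if h >= 0 else "<ROOT>"
        feats[("word", head_word, words[m])] += 1
        feats[("tag", head_tag, tags[m])] += 1
    return feats


def perceptron_update(weights: Dict[Hashable, float], selected: Heads,
                      predicted: Heads, words: Sequence[str],
                      tags: Sequence[str], lr: float = 1.0) -> None:
    """Move the target parser's weights toward the selected parse and away
    from the parse it currently predicts (perceptron-style, in place)."""
    gold = edge_features(selected, words, tags)
    pred = edge_features(predicted, words, tags)
    for f in gold.keys() | pred.keys():
        delta = gold[f] - pred[f]
        if delta:
            weights[f] = weights.get(f, 0.0) + lr * delta


# Toy example: source "the dog barks", target "der Hund bellt", one-to-one alignment.
source_heads = [1, 2, -1]
alignment = {0: 0, 1: 1, 2: 2}
k_best = [[2, 2, -1], [1, 2, -1]]  # delexicalized parser output, best-first
best = select_best_parse(k_best, source_heads, alignment)

weights: Dict[Hashable, float] = {}
perceptron_update(weights, best, k_best[0],
                  ["der", "Hund", "bellt"], ["DET", "NOUN", "VERB"])
```

In this toy case the second-ranked parse is selected because both of its non-root edges match the source parse, and the update only touches features on which the selected and predicted parses disagree.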