Title of invention METHOD AND APPARATUS FOR PARAPHRASE ACQUISITION
Abstract A computer-based natural language processing method for identifying paraphrases in corpora using statistical analysis comprises: deriving a set of starting paraphrases (SPs) from a parallel corpus, each SP having at least two phrases that are phrase-aligned; generating a set of paraphrase patterns (PPs) by identifying shared terms within the two aligned phrases of an SP and defining a PP whose left-hand side (LHS) and right-hand side (RHS) expressions have slots in place of the shared terms; and collecting output paraphrases (OPs) by identifying instances of the PPs in a non-parallel corpus. Because the PPs are generated from paraphrase information reliably derived from a small parallel corpus, and their instances are then collected across the large non-parallel corpus, the method achieves better coverage of the paraphrases in the language with fewer errors.
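The pattern-generation and collection steps named in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the `X1`/`X2` slot notation, the whitespace tokenization, the single-word slot fillers, and the small `STOPWORDS` set are all assumptions made for illustration, and the statistical analysis mentioned in the abstract is omitted.

```python
import re
from collections import defaultdict

# Hypothetical stopword list: shared function words are kept as literals.
STOPWORDS = {"the", "of", "a", "an", "to", "in"}


def make_pattern(lhs, rhs):
    """Turn one starting paraphrase (SP) into a paraphrase pattern (PP)
    by replacing terms shared by both phrases with numbered slots."""
    lhs_toks, rhs_toks = lhs.split(), rhs.split()
    slots = {}
    for tok in lhs_toks:
        if tok in rhs_toks and tok not in STOPWORDS and tok not in slots:
            slots[tok] = f"X{len(slots) + 1}"
    lhs_pat = [slots.get(t, t) for t in lhs_toks]
    rhs_pat = [slots.get(t, t) for t in rhs_toks]
    return lhs_pat, rhs_pat, set(slots.values())


def to_regex(pat, slot_names):
    """Compile one side of a PP; each slot matches a single word (assumption)."""
    parts, seen = [], set()
    for tok in pat:
        if tok in slot_names:
            # A repeated slot must match the same filler, so use a backreference.
            parts.append(f"(?P={tok})" if tok in seen else f"(?P<{tok}>\\w+)")
            seen.add(tok)
        else:
            parts.append(re.escape(tok))
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b")


def collect_output_paraphrases(lhs_pat, rhs_pat, slot_names, corpus):
    """Pair LHS and RHS instances from a non-parallel corpus that share
    the same slot fillers; these pairs are the output paraphrases (OPs)."""
    regexes = (to_regex(lhs_pat, slot_names), to_regex(rhs_pat, slot_names))
    order = sorted(slot_names)
    found = defaultdict(lambda: [None, None])
    for sentence in corpus:
        for side, regex in enumerate(regexes):
            for m in regex.finditer(sentence):
                key = tuple(m.group(name) for name in order)
                found[key][side] = m.group(0)
    return sorted((l, r) for l, r in found.values() if l and r)


lhs_pat, rhs_pat, slot_names = make_pattern(
    "Shakespeare wrote Hamlet", "Hamlet was written by Shakespeare"
)
corpus = [
    "Tolstoy wrote Resurrection in 1899.",
    "Critics agree Resurrection was written by Tolstoy.",
    "Austen wrote Emma.",
]
print(collect_output_paraphrases(lhs_pat, rhs_pat, slot_names, corpus))
```

In this sketch a pair is kept only when both sides of the pattern are observed in the corpus with identical fillers, so the third sentence, which instantiates only the LHS, yields no output paraphrase.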
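Publication number CA2793268(A1) Publication date 2013.04.21
Application number CA20122793268 Filing date 2012.10.19
Applicant NATIONAL RESEARCH COUNCIL OF CANADA Inventors FUJITA, ATSUSHI; ISABELLE, PIERRE
Classification G06F17/28 Main classification G06F17/28
Agency Agent
Principal claim
Address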