Title of invention METHOD AND APPARATUS FOR PARAPHRASE ACQUISITION
Abstract A computer-based natural language processing method for identifying paraphrases in corpora using statistical analysis comprises: deriving a set of starting paraphrases (SPs) from a parallel corpus, each SP having at least two phrases that are phrase-aligned; generating a set of paraphrase patterns (PPs) by identifying shared terms within the two aligned phrases of an SP and defining a PP that has slots in place of the shared terms in its left-hand side (LHS) and right-hand side (RHS) expressions; and collecting output paraphrases (OPs) by identifying instances of the PPs in a non-parallel corpus. Because the PPs are generated from paraphrase information reliably derived from a small parallel corpus, and their instances are then collected over the large non-parallel corpus, the method covers more of the paraphrases in the language and produces fewer errors.
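The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method: it assumes whitespace tokenization, single-word slot fillers, and an acceptance rule under which an LHS instance is kept only when the RHS filled with the same words also occurs somewhere in the non-parallel corpus. The names `make_pattern`, `side_regex`, and `collect`, and the example phrases, are hypothetical.

```python
import re
from typing import List, Optional, Set, Tuple

# A pattern is (LHS tokens, RHS tokens); slot tokens look like "X1", "X2", ...
Pattern = Tuple[List[str], List[str]]
SLOT = re.compile(r"X\d+")


def make_pattern(lhs: str, rhs: str) -> Optional[Pattern]:
    """Replace terms shared by both aligned phrases of an SP with numbered slots."""
    lhs_toks, rhs_toks = lhs.split(), rhs.split()
    # Shared terms, in order of first appearance on the LHS.
    shared = [t for t in dict.fromkeys(lhs_toks) if t in rhs_toks]
    if not shared:
        return None  # no shared term, so no slot can be defined
    slot = {term: f"X{i}" for i, term in enumerate(shared, start=1)}
    return ([slot.get(t, t) for t in lhs_toks],
            [slot.get(t, t) for t in rhs_toks])


def side_regex(tokens: List[str]) -> "re.Pattern[str]":
    """Compile one side of a PP; each slot matches one word (an assumption)."""
    seen: Set[str] = set()
    parts = []
    for t in tokens:
        if SLOT.fullmatch(t):
            # A repeated slot must be filled by the same word again.
            parts.append(f"(?P={t})" if t in seen else f"(?P<{t}>\\w+)")
            seen.add(t)
        else:
            parts.append(re.escape(t))
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b")


def collect(pattern: Pattern, corpus: List[str]) -> Set[Tuple[str, str]]:
    """Collect output paraphrases: LHS instances whose filled RHS is also attested."""
    lhs, rhs = pattern
    lhs_re = side_regex(lhs)
    found: Set[Tuple[str, str]] = set()
    for sentence in corpus:
        for m in lhs_re.finditer(sentence):
            fillers = m.groupdict()
            candidate = " ".join(fillers.get(t, t) for t in rhs)
            attested = any(
                re.search(r"\b" + re.escape(candidate) + r"\b", s) for s in corpus
            )
            if attested:
                found.add((" ".join(m.group(0).split()), candidate))
    return found


# Hypothetical starting paraphrase and toy non-parallel corpus.
pattern = make_pattern("price of oil", "oil price")
corpus = [
    "the cost of living rose again",
    "living cost is a concern",
    "a piece of cake",
]
ops = collect(pattern, corpus)
print(pattern)
print(ops)
```

Requiring both sides of a pattern to be attested is one simple way to limit over-generation from very general patterns; the abstract itself only states that instances of the PPs are identified in the non-parallel corpus.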
Publication number US2013103390(A1) Publication date 2013.04.25
Application number US201213655852 Filing date 2012.10.19
Applicant FUJITA ATSUSHI;ISABELLE PIERRE Inventor FUJITA ATSUSHI;ISABELLE PIERRE
Classification G06F17/27 Main classification G06F17/27
Agency Agent
Principal claim
Address