Title of invention METHOD AND SYSTEM FOR SIMPLIFYING IMPLICIT RHETORICAL RELATION PREDICTION IN LARGE SCALE ANNOTATED CORPUS
Abstract <p>The present invention provides a method and system for predicting implicit rhetorical relations between two spans of text, e.g., in a large annotated corpus such as the Penn Discourse Treebank ("PDTB"), the Rhetorical Structure Theory corpus, or the Discourse Graph Bank, and particularly for determining a rhetorical relation in the absence of an explicit discourse marker. Surface-level features may be used to capture pragmatic information encoded in the absent marker. In one manner, a simplified feature set based only on raw text and semantic dependencies is used to improve performance for all relations. By using surface-level features to predict implicit rhetorical relations for the large annotated corpus, the invention approaches a theoretical maximum performance, suggesting that more data will not necessarily improve performance based on these and similarly situated features.</p>
Publication number CA2917153(A1) Publication date 2015.01.08
Application number CA20142917153 Filing date 2014.07.03
Applicant THOMSON REUTERS GLOBAL RESOURCES Inventors HOWALD, BLAKE;NYSTROM, ANDREW
Classification G06F17/27;G06F17/30 Main classification G06F17/27
Agency Agent
Main claim
Address