Title of invention: A METHOD FOR THE EXTRACTION OF RELATION PATTERNS FROM ARTICLES
Abstract: A method for building a knowledge base containing entailment relations comprises the steps of: a) providing at least one input pattern (p) with N pattern slots (N>1), said input pattern (p) expressing a specific semantic relation between N entities that fill the N pattern slots of the input pattern (p) as slot fillers; b) providing at least one cluster (c) of articles, said articles of said cluster (c) relating to a common main topic; c) processing said articles with respect to the input pattern (p) and identifying the entities which match the semantic type of the N pattern slots; d) if said at least one input pattern matches a portion of an article (a) of said at least one cluster (c): i) storing the N slot fillers (s_1, s_2, ..., s_N), which match the slots of the pattern (p), and a cluster identifier l_c of the cluster (c) into a first table S, wherein the N-tuple (s_1, s_2, ..., s_N) and the cluster identifier l_c of the associated cluster (c) form one element of said table S; ii) for each element of table S, identifying appearances of the slot fillers (s_1, s_2, ..., s_N) in a plurality of articles of cluster (c) and, for each appearance so identified, storing the slot fillers (s_1, s_2, ..., s_N) together with the sentence in which they occur into a second table C_0; iii) from the sentences stored in table C_0, extracting patterns which span over the corresponding N slot fillers (s_1, s_2, ..., s_N), each extracted pattern expressing a semantic relation between said N slot fillers; and iv) storing said extracted pattern together with said input pattern as an entailment relation into said knowledge base.
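Steps a) to d) of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes that patterns are surface templates with placeholders X1..XN, that a slot filler is a single capitalized token, that sentences can be split on end punctuation, and that an "extracted pattern" is simply the text spanning the fillers with the fillers replaced by placeholders. The names `template_to_regex`, `split_sentences` and `extract_entailments` and the sample articles are hypothetical.

```python
import re


def template_to_regex(template, n_slots):
    """Turn a template such as 'X1 acquired X2' into a regex with named slots.

    Assumption: a slot filler is one capitalized token.
    """
    pattern = re.escape(template)
    # Replace higher-numbered placeholders first so 'X1' never clobbers 'X10'.
    for i in range(n_slots, 0, -1):
        pattern = pattern.replace("X%d" % i, "(?P<s%d>[A-Z][\\w-]*)" % i, 1)
    return re.compile(pattern)


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_entailments(template, n_slots, clusters):
    """clusters maps a cluster identifier to a list of article strings."""
    regex = template_to_regex(template, n_slots)

    # Step d.i: table S holds (slot-filler tuple, cluster identifier) elements.
    table_s = set()
    for cluster_id, articles in clusters.items():
        for article in articles:
            for m in regex.finditer(article):
                fillers = tuple(m.group("s%d" % i) for i in range(1, n_slots + 1))
                table_s.add((fillers, cluster_id))

    # Step d.ii: table C0 holds (slot fillers, sentence) for every sentence
    # of the same cluster in which all fillers occur.
    table_c0 = []
    for fillers, cluster_id in sorted(table_s):
        for article in clusters[cluster_id]:
            for sentence in split_sentences(article):
                if all(f in sentence for f in fillers):
                    table_c0.append((fillers, sentence))

    # Steps d.iii and d.iv: generalize the span covering all fillers and
    # store it with the input pattern as an (extracted, input) pair.
    knowledge_base = set()
    for fillers, sentence in table_c0:
        starts = [sentence.find(f) for f in fillers]
        begin = min(starts)
        end = max(pos + len(f) for pos, f in zip(starts, fillers))
        span = sentence[begin:end]
        for i, f in enumerate(fillers, 1):
            span = span.replace(f, "X%d" % i)
        if span != template:  # skip trivial re-discovery of the input pattern
            knowledge_base.add((span, template))
    return knowledge_base


# One cluster of two articles on a common main topic (invented sample text).
clusters = {
    "c1": [
        "Google acquired YouTube. The deal closed in 2006.",
        "Analysts said Google bought YouTube to expand its video business.",
    ]
}
kb = extract_entailments("X1 acquired X2", 2, clusters)
print(kb)
```

On this sample input the sketch yields the single pair `("X1 bought X2", "X1 acquired X2")`. A fuller implementation would match slot fillers by semantic type and extract patterns from syntactic structure rather than from raw substrings.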
Publication number: EP2137638(A1) Publication date: 2009.12.30
Application number: EP20080736245 Filing date: 2008.04.15
Applicant: THE EUROPEAN COMMUNITY, REPRESENTED BY THE EUROPEAN COMMISSION Inventor: TANEV, HRISTO TANEV
Classification: G06F17/27 Main classification: G06F17/27
Agency: Agent:
Main claim:
Address: