Title of invention: MINING STRONG RELEVANCE BETWEEN HETEROGENEOUS ENTITIES FROM THEIR CO-OCCURRENCES
Abstract: Given two heterogeneous entities, the prevalence of text data provides rich co-occurrence information for them. However, co-occurrence alone is noisy: it may reflect nothing more than an accidental mention, or it may merely reflect domain-specific common words. Only strong relevance between entities, supported by rich relevance contexts in the data, can indicate meaningful entity relationships. Strong relevance between heterogeneous entities is mined from their co-occurrences. Drug-disease therapeutic relationships are used as an example to demonstrate an application of this work.
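Application publication number: US2015332158(A1)
Application publication date: 2015.11.19
Application number: US201414279617
Filing date: 2014.05.16
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: He Qi; Ji Ming; Spangler W. Scott
Classification: G06N7/00; G06F17/30; G06N99/00
Main classification: G06N7/00
Agency: (none listed)
Agent: (none listed)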
Principal claim: 1. A computer-implemented method comprising: receiving data associated with a co-occurrence graph among heterogeneous entities, said co-occurrence graph comprising a plurality of nodes, each node representing an entity in said heterogeneous entities, wherein any two nodes in said co-occurrence graph are connected by an edge when they co-occur in a knowledge base, with a weight of said edge being equal to the number of times entities associated with said two nodes co-occur in said knowledge base; receiving a query comprising a query entity name and a target entity type; receiving a plurality of meta paths to constrain co-occurrence scope of any two heterogeneous entities in said co-occurrence graph; generating a subgraph of said co-occurrence graph with path instances of said received meta paths; and outputting entities from said subgraph belonging to said target entity type and having strong relevance with said query entity name based on a probabilistic context-aware relevance model, where said strong relevance is constrained by said received meta paths.
Address: ARMONK NY US
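
The steps recited in the method claim above (claim 1) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy documents in `DOCS`, the function names, counting co-occurrence once per document, and above all the scoring. The record does not specify the probabilistic context-aware relevance model, so the sketch substitutes a simple meta-path-constrained random-walk score as a stand-in; it is not the claimed model.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy knowledge base: each document lists (entity_name, entity_type) mentions.
DOCS = [
    [("aspirin", "drug"), ("COX1", "gene"), ("headache", "disease")],
    [("aspirin", "drug"), ("COX1", "gene")],
    [("COX1", "gene"), ("headache", "disease")],
    [("aspirin", "drug"), ("fever", "disease")],
    [("ibuprofen", "drug"), ("COX1", "gene")],
]


def build_cooccurrence_graph(docs):
    """Return (types, weights); an edge weight counts the documents where two entities co-occur."""
    types, weights = {}, defaultdict(int)
    for doc in docs:
        mentions = sorted(set(doc))
        for name, etype in mentions:
            types[name] = etype
        for (a, _), (b, _) in combinations(mentions, 2):
            weights[frozenset((a, b))] += 1
    return types, dict(weights)


def path_instances(types, weights, query, meta_path):
    """Enumerate simple paths from `query` whose node types follow `meta_path`."""
    if types.get(query) != meta_path[0]:
        return []
    neighbors = defaultdict(set)
    for edge in weights:
        a, b = tuple(edge)
        neighbors[a].add(b)
        neighbors[b].add(a)
    paths = [[query]]
    for etype in meta_path[1:]:
        paths = [p + [n] for p in paths for n in sorted(neighbors[p[-1]])
                 if types[n] == etype and n not in p]
    return paths


def meta_path_subgraph(types, weights, query, meta_paths):
    """Collect the edges that lie on at least one path instance of the given meta paths."""
    edges = set()
    for meta_path in meta_paths:
        for path in path_instances(types, weights, query, meta_path):
            edges.update(frozenset(pair) for pair in zip(path, path[1:]))
    return edges


def rank_targets(types, weights, query, target_type, meta_paths):
    """Stand-in score (assumption): summed random-walk probability over path instances."""
    degree = defaultdict(int)
    for edge, w in weights.items():
        for node in edge:
            degree[node] += w
    scores = defaultdict(float)
    for meta_path in meta_paths:
        if meta_path[-1] != target_type:
            continue
        for path in path_instances(types, weights, query, meta_path):
            prob = 1.0
            for u, v in zip(path, path[1:]):
                prob *= weights[frozenset((u, v))] / degree[u]
            scores[path[-1]] += prob
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


types, weights = build_cooccurrence_graph(DOCS)
META_PATHS = [("drug", "disease"), ("drug", "gene", "disease")]
ranking = rank_targets(types, weights, "aspirin", "disease", META_PATHS)
print(ranking)
```

On this toy data, `headache` ranks above `fever` for the query `aspirin`, because it is reached both directly and through the gene-mediated meta path, whereas `fever` is supported by a single direct co-occurrence. Restricting the walk to the supplied meta paths is what keeps unrelated co-occurrences out of the score.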