Title: Systems and methods for extracting patterns from graph and unstructured data
Abstract: A computing system receives input data having both graph and unstructured data and computes a current log likelihood of the input data. The computing system compares the current log likelihood with a previous log likelihood of the input data. If the current log likelihood is larger than the previous log likelihood, the computing system updates topic modeling parameters, community modeling parameters, and the link generation parameter until the computing system obtains a maximal value of the log likelihood of the input data. Then, the computing system creates a graph indicating topic similarity between the input data based on the topic modeling parameters, creates another graph indicating community similarity between entities associated with the input data based on the community modeling parameters, and predicts a link existence between input data or entities based on the link generation parameter, the topic modeling parameter and the community modeling parameter.
Publication number: US8756053(B2) Publication date: 2014.06.17
Application number: US201213606810 Filing date: 2012.09.07
Applicant: International Business Machines Corporation Inventors: Gryc Wojciech; Lawrence Richard; Liu Yan
Classification: G06F17/27 Main classification: G06F17/27
Agency: Scully, Scott, Murphy & Presser, P.C. Agent: Scully, Scott, Murphy & Presser, P.C.; Morris, Esq. Daniel P.
Main claim: 1. A method implemented in a computer system for discovering a relationship between entities comprising: receiving, at said computer system, input data W representing a word vector matrix and input data G representing a link graph matrix having values that capture relationships amongst user or business entities; computing a current log likelihood of the input data W and G, the current likelihood of the input data being a probability distribution function of topic modeling parameters and community modeling parameters, the topic modeling parameters representing topic similarity between unstructured texts of the entities and the community modeling parameters representing community similarity between the entities; comparing the current log likelihood of the input data and a previous log likelihood of the input data computed previously; updating values of the parameters, if the current log likelihood is larger than the previous log likelihood; repeating the comparing and the updating until the current log likelihood becomes less than or equal to the previous log likelihood; and constructing at least one graph based on the updated values of the parameters when the current log likelihood is less than or equal to the previous log likelihood, the at least one graph indicating the relationship between the entities, wherein a program using a processor unit executes one or more of said computing, comparing, updating, and constructing.
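The compare-and-update loop recited in the claim can be illustrated with a minimal sketch. It shows only the control flow: compute the current log likelihood, compare it with the previous one, update while it improves, and stop once it is less than or equal to the previous value. The function name `fit_by_likelihood_ascent`, the `max_iters` safety cap, and the one-parameter toy objective are all assumptions made for illustration; the record does not specify the actual likelihood of W and G or the update rule for the topic, community, and link parameters.

```python
import math
from typing import Callable, Tuple, TypeVar

P = TypeVar("P")


def fit_by_likelihood_ascent(
    params: P,
    log_likelihood: Callable[[P], float],
    update: Callable[[P], P],
    max_iters: int = 1000,  # hypothetical safety cap, not part of the claim
) -> Tuple[P, float]:
    """Update params while the log likelihood keeps increasing.

    Returns the parameters held when the loop stops, together with the
    last log likelihood that was computed.
    """
    previous = -math.inf
    current = log_likelihood(params)
    iterations = 0
    # Keep updating only while the current value beats the previous one.
    while current > previous and iterations < max_iters:
        previous = current
        params = update(params)
        current = log_likelihood(params)
        iterations += 1
    return params, current


# Toy stand-ins (assumptions): a concave objective peaking at 3.0 and an
# update that moves the parameter halfway toward that peak.
TARGET = 3.0


def toy_log_likelihood(x: float) -> float:
    return -((x - TARGET) ** 2)


def toy_update(x: float) -> float:
    return x + 0.5 * (TARGET - x)


x_hat, final_ll = fit_by_likelihood_ascent(0.0, toy_log_likelihood, toy_update)
```

In a real system the returned parameters would then feed the graph-construction step; here `x_hat` simply ends up close to the peak of the toy objective.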
Address: Armonk NY US