Title of invention: Generating a subgraph of key entities in a network and categorizing the subgraph entities into different types using social network analysis
Abstract: A method, system and computer-program product for generating a subgraph of key entities in a network and organizing entities in the subgraph are disclosed. The technique uses social network analysis centrality metrics to identify key entities in a network. The technique also uses social network analysis centrality metrics to categorize key entities into different types.
Publication number: US9195962(B2)  Publication date: 2015.11.24
Application number: US201012877914  Filing date: 2010.09.08
Applicant:  Inventors: Topham Philip S.; Purushothaman Senthil Kumar; Peeler Ryan; Page Anthony M.; Vesely Daniel
Classification: G06F15/173; G06Q10/10; G06Q30/02  Main classification: G06F15/173
Agency: Nguyen & Tarbet Patent Law Firm  Agent: Nguyen & Tarbet Patent Law Firm
Principal claim: 1. In a non-transitory computer-readable medium, a method of generating a subgraph of key entities in a network, and segmenting the subgraph of key entities of a network into sub-groups based on network centrality metrics, the method comprising:
a. generating the subgraph of key entities, comprising:
  i. determining, by executing at least one computer program on one or more processors, at least two independent types of network centrality metrics for a plurality of entities in a giant component of the network; wherein the network is a publication-based network; wherein the entities are publication authors or institutions, and links between the entities in the network are based upon co-authorships, co-citations, cross-institutional relationships, or a combination thereof, from publication data;
  ii. selecting starting cut-off criteria for each network centrality metric;
  iii. assigning a plurality of rank-based scores to each entity, where each rank-based score corresponds to each network centrality metric value of each entity;
  iv. selecting some of the entities to form a tentative subgraph of key entities, wherein a subgraph of key entities is defined as a social network graph of a group of entities within the giant component of the network, where the selection of key entities is based on comparisons of the plurality of rank-based scores to the plurality of cut-off criteria, the selection process comprising:
    1. identifying the entities whose score for a network centrality metric exceeds the cut-off criteria;
    2. merging the identified entities into the tentative subgraph; and
    3. iteratively repeating steps 1 and 2 of the selection process for each network centrality metric;
  v. calculating a reach of the entities in the tentative subgraph of key entities and comparing the reach to a reach range, wherein the reach is defined as a fraction of entities in the giant component which are connected to the entities in the tentative subgraph by first-degree links, second-degree links or a combination thereof; wherein the first-degree and second-degree links are based upon co-authorships, co-citations, cross-institutional relationships, or a combination thereof, and where the reach range is defined by a minimum and a maximum reach threshold;
  vi. if the reach falls within the reach range, defined as falling above the minimum reach threshold and below the maximum reach threshold, assigning the entities in the tentative subgraph of key entities to the subgraph of key entities, wherein the subgraph of key entities represents a core group of key opinion leaders within a social network;
  vii. if the reach does not fall within the reach range, independently adjusting one or more of the cut-off criteria for each network centrality metric; and
  viii. iteratively repeating steps iii-vii of the generating the subgraph of key entities until the subgraph of key entities with a reach within the reach range is formed, and
b. segmenting the subgraph into subgroups, comprising:
  i. calculating, by executing at least one computer program on one or more processors, at least two independent network centrality metric values for each entity in the subgraph of key entities; wherein the network centrality metrics quantitatively indicate independent characteristics of an entity in the network;
  ii. assigning a plurality of rank-based scores to each entity, where each rank-based score corresponds to each network centrality metric value of each entity;
  iii. defining a plurality of rank-based score ranges for each of the network centrality metrics;
  iv. defining a plurality of subgroups corresponding to combinations of the rank-based score ranges; and
  v. assigning each entity of the subgraph of key entities to one or more subgroups, comprising:
    1. for each rank-based score range, determining the score range that the entity's rank-based scores fall into;
    2. for each subgroup, determining if the entity's rank-based scores fall within the score ranges corresponding to the subgroup;
    3. assigning the entity to the corresponding subgroup; and
    4. if the entity falls within more than one subgroup, removing the entity from one or more of the subgroups;
  vi. repeating steps iii-v of the segmenting the subgraph into subgroups until a predetermined number of entities is assigned to each subgroup.
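The two parts of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several choices are assumptions made only for illustration: degree and closeness stand in for the "at least two independent" centrality metrics, rank-based scores are percentile-style values in (0, 1], reach counts the key entities themselves along with entities one or two links away, all cut-offs are moved together by a fixed step (the claim allows independent adjustment), and segmentation uses a single hypothetical threshold per metric. All function names and the toy co-authorship graph are hypothetical.

```python
from bisect import bisect_right
from collections import deque

def degree_centrality(adj):
    """Number of direct links per entity."""
    return {n: len(nbrs) for n, nbrs in adj.items()}

def closeness_centrality(adj):
    """(reachable entities) / (sum of shortest-path lengths), found by BFS."""
    result = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[src] = (len(dist) - 1) / total if total else 0.0
    return result

def rank_scores(values):
    """Assumed scoring: share of entities whose metric value is <= this one."""
    ordered = sorted(values.values())
    return {n: bisect_right(ordered, v) / len(ordered) for n, v in values.items()}

def reach(adj, key):
    """Assumed definition: fraction of entities that are key or within two links of a key entity."""
    covered = set(key)
    for k in key:
        for v in adj[k]:
            covered.add(v)      # first-degree link
            covered |= adj[v]   # second-degree links
    return len(covered) / len(adj)

def key_subgraph(adj, metrics, cutoffs, reach_range, step=0.05, max_iter=100):
    """Part (a): adjust cut-offs until the reach lies strictly inside reach_range."""
    lo, hi = reach_range
    # Scores do not change between iterations, so they are computed once here.
    scores = {name: rank_scores(fn(adj)) for name, fn in metrics.items()}
    cutoffs = dict(cutoffs)
    for _ in range(max_iter):
        key = set()
        for name, sc in scores.items():  # merge the selections of every metric
            key |= {n for n, s in sc.items() if s > cutoffs[name]}
        r = reach(adj, key)
        if lo < r < hi:
            return key, r, cutoffs
        delta = -step if r <= lo else step  # too little reach: admit more entities
        for name in cutoffs:
            cutoffs[name] = min(1.0, max(0.0, cutoffs[name] + delta))
    raise ValueError("no cut-offs produced a reach inside the range")

def segment(adj, key, metrics, threshold=0.5):
    """Part (b): group key entities by high/low rank score on each metric."""
    scores = {}
    for name, fn in metrics.items():
        values = fn(adj)  # assumption: metrics measured on the full network
        scores[name] = rank_scores({n: values[n] for n in key})
    groups = {}
    for n in key:
        label = tuple("high" if scores[m][n] > threshold else "low" for m in metrics)
        groups.setdefault(label, set()).add(n)
    return groups

# Toy connected graph, so the whole graph is its own giant component.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e"),
         ("e", "f"), ("f", "g"), ("g", "h")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

metrics = {"degree": degree_centrality, "closeness": closeness_centrality}
key, r, cutoffs = key_subgraph(adj, metrics, {"degree": 1.0, "closeness": 1.0}, (0.5, 0.9))
groups = segment(adj, key, metrics)
```

On this toy graph the loop starts with an empty selection, lowers both cut-offs once, and settles on entities a, d and e with a reach of 0.875. Because the high/low ranges do not overlap, each entity lands in exactly one subgroup, so the claim's duplicate-removal step and its repetition until a predetermined subgroup size are not modelled here.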
Address: