Title Building Entity Relationship Networks from n-ary Relative Neighborhood Trees
Abstract Entities are objects with feature values that can be thought of as vectors in N-space, where N is the number of features. Similarity between any two entities can be calculated as a distance between the two entity vectors. A similarity network can be drawn between a set of entities based on connecting two entities that are relatively near to each other in N-space. Binary relative neighborhood trees are a special type of entity relationship network, designed to be useful in visualizing the entity space. They have the intuitively simple property that the more typical entities occur at the top of the tree and the more unusual entities occur at the leaf nodes. By limiting the number of links to n+1 per node (one parent, n children), a regularized flat tree structure is created that is much easier to visualize and navigate at both a coarse and a fine level by domain experts.
Publication No. US2015324481(A1) Publication Date 2015.11.12
Application No. US201414270613 Filing Date 2014.05.06
Applicant International Business Machines Corporation Inventor Spangler W Scott
Classification G06F17/30 Main Classification G06F17/30
Agency Agent
Main Claim 1. A computer-implemented method comprising: receiving: (a) a target set of entities, E, (b) a set of features, F, describing entities in E, and (c) a maximum number of allowed children, n, where n>1; computing, across entities in E and features in F, a set of feature vectors comprising a feature vector for each entity in E; computing an average feature vector, A, of said set of feature vectors; identifying a root entity in E whose feature vector distance from A is smallest and assigning it as a root node in a candidate set C representing a tree of nodes; identifying another entity in E whose feature vector distance from an existing node in C is smallest and adding it as a child to that existing node when it has no more than n children, otherwise, adding it to another existing node without n children with whom its feature vector distance is smallest, where this step is repeated until all entities in E are added as children of existing nodes in C; and outputting a nodal representation of said tree.
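The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is one possible reading of the claim, not the applicant's implementation. Several details are assumptions: the function name `build_tree` and its arguments are hypothetical, Euclidean distance is used because the claim does not name a metric, the feature vectors are taken as already computed, and "no more than n children" is read as "fewer than n children" so that no node ever exceeds n.

```python
import math


def build_tree(vectors, n):
    """Greedy n-ary tree construction, sketched from claim 1.

    vectors: dict mapping entity name -> feature vector (list of floats).
    n: maximum number of children per node (the claim requires n > 1).
    Returns (root, children), where children maps each node to its child list.
    """
    if n <= 1:
        raise ValueError("n must be greater than 1")
    names = list(vectors)
    dim = len(vectors[names[0]])

    # Average feature vector A over all entities.
    avg = [sum(vectors[e][i] for e in names) / len(names) for i in range(dim)]

    # Root: the entity whose vector is closest to A (Euclidean, an assumption).
    root = min(names, key=lambda e: math.dist(vectors[e], avg))
    children = {root: []}
    remaining = [e for e in names if e != root]

    while remaining:
        # Closest (unplaced entity, existing node) pair.
        _, entity, node = min(
            ((math.dist(vectors[e], vectors[c]), e, c)
             for e in remaining for c in children),
            key=lambda t: t[0],
        )
        # If that node is full, fall back to the nearest node with room.
        # A node with room always exists because every leaf has no children.
        if len(children[node]) >= n:
            open_nodes = [c for c in children if len(children[c]) < n]
            node = min(open_nodes,
                       key=lambda c: math.dist(vectors[entity], vectors[c]))
        children[node].append(entity)
        children[entity] = []
        remaining.remove(entity)

    return root, children


# Illustrative data only: four made-up entities in 2-space.
points = {
    "r": [0.0, 0.0],
    "p1": [1.0, 0.0],
    "p2": [-1.05, 0.0],
    "p3": [0.0, 1.1],
}
root, children = build_tree(points, n=2)
print(root, children)
```

In this made-up example the entity nearest the average becomes the root, the two closest entities fill its two child slots, and the remaining entity, whose nearest node is the full root, is attached to the nearest node that still has room. The search is quadratic in the number of entities at each step, which keeps the sketch short but would need an index structure for large entity sets.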
Address Armonk NY US