Independent claim |
1. A computer-implemented method comprising:
receiving: (a) a target set of entities, E, (b) a set of features, F, describing entities in E, and (c) a maximum number of allowed children, n, where n>1;
computing, across entities in E and features in F, a set of feature vectors comprising a feature vector for each entity in E;
computing an average feature vector, A, of said set of feature vectors;
identifying a root entity in E whose feature vector distance from A is smallest and assigning it as a root node in a candidate set C representing a tree of nodes;
identifying another entity in E whose feature vector distance from an existing node in C is smallest and adding it as a child to that existing node when it has no more than n children, otherwise, adding it to another existing node without n children with whom its feature vector distance is smallest, where this step is repeated until all entities in E are added as children of existing nodes in C; and
outputting a nodal representation of said tree. |
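The claimed steps can be illustrated with a minimal Python sketch. It is not the patented implementation; it rests on several assumptions that the claim does not state: feature vectors are already computed and passed in as a dict, distance is Euclidean, a node accepts a new child only while it has fewer than n children (so that n acts as a maximum), and ties are broken by input order. The names `build_tree`, `vectors`, and `children` are hypothetical.

```python
import math


def build_tree(vectors, n):
    """Greedy tree construction sketched from the claim.

    vectors: dict mapping each entity in E to its feature vector (list of floats).
    n: maximum number of children per node, n > 1.
    Returns (root, children), where children maps each node to its child list.
    """
    if n <= 1:
        raise ValueError("n must be greater than 1")
    if not vectors:
        raise ValueError("E must contain at least one entity")

    # Average feature vector A over all entities.
    dim = len(next(iter(vectors.values())))
    avg = [sum(v[i] for v in vectors.values()) / len(vectors) for i in range(dim)]

    # Root: the entity whose vector is closest to A (Euclidean distance assumed).
    root = min(vectors, key=lambda e: math.dist(vectors[e], avg))
    children = {root: []}  # candidate set C, stored as node -> list of children
    remaining = [e for e in vectors if e != root]

    def dist(a, b):
        return math.dist(vectors[a], vectors[b])

    while remaining:
        # Entity/node pair with the smallest distance over all nodes already in C.
        entity, nearest = min(
            ((e, p) for e in remaining for p in children),
            key=lambda pair: dist(pair[0], pair[1]),
        )
        if len(children[nearest]) < n:
            parent = nearest
        else:
            # Nearest node is full: fall back to the closest node with spare capacity.
            # A leaf always exists, so this candidate list is never empty.
            open_nodes = [p for p in children if len(children[p]) < n]
            parent = min(open_nodes, key=lambda p: dist(entity, p))
        children[parent].append(entity)
        children[entity] = []
        remaining.remove(entity)

    return root, children
```

Each iteration scans every remaining entity against every node in C, so the sketch runs in roughly cubic time in the number of entities. That is acceptable for illustration, but a spatial index or cached nearest-neighbour distances would be needed for a large E.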