Abstract |
A computerized method of representing a dataset with a taxonomy includes obtaining a dataset comprising a plurality of records, the dataset being characterized by a vocabulary and each of the plurality of records being characterized by at least one term within the vocabulary; identifying nearest neighbors for each term within the vocabulary; imputing a degree of membership for each nearest neighbor identified for each term within the vocabulary; augmenting the obtained dataset with the imputed degree of membership; and generating a taxonomy of the augmented dataset.
|
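The steps recited in the abstract can be illustrated with a minimal sketch. The abstract does not specify a similarity measure, an imputation rule, or a taxonomy-generation algorithm, so everything concrete here is an assumption chosen for illustration: cosine similarity over term co-occurrence, a degree of membership equal to the highest similarity among a record's observed terms, and average-linkage agglomerative clustering. The toy records and all function names (`nearest_neighbors`, `augment`, `build_taxonomy`, `leaves`) are hypothetical.

```python
from itertools import combinations
from math import sqrt

def term_vectors(records, vocab):
    # Represent each term by the set of record indices that contain it.
    return {t: {i for i, r in enumerate(records) if t in r} for t in vocab}

def cosine_sets(a, b):
    # Cosine similarity between two binary occurrence sets (assumed measure).
    return len(a & b) / sqrt(len(a) * len(b)) if a and b else 0.0

def nearest_neighbors(vectors, k):
    # For each term, keep up to k other terms with the highest nonzero similarity.
    nn = {}
    for t in vectors:
        sims = [(cosine_sets(vectors[t], vectors[u]), u) for u in vectors if u != t]
        sims = sorted((p for p in sims if p[0] > 0), key=lambda p: (-p[0], p[1]))
        nn[t] = [(u, s) for s, u in sims[:k]]
    return nn

def augment(records, nn):
    # Observed terms get membership 1.0; each neighbor term absent from the
    # record gets an imputed membership equal to its best similarity (assumed rule).
    augmented = []
    for r in records:
        m = {t: 1.0 for t in r}
        for t in r:
            for u, s in nn[t]:
                if u not in r:
                    m[u] = max(m.get(u, 0.0), s)
        augmented.append(m)
    return augmented

def record_similarity(a, b):
    # Cosine similarity between two term -> membership maps.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_taxonomy(augmented):
    # Average-linkage agglomerative clustering (assumed method).
    # Returns a nested tuple whose leaves are record indices.
    clusters = [(i, [i]) for i in range(len(augmented))]  # (subtree, members)
    while len(clusters) > 1:
        best = None
        for (x, (_, ma)), (y, (_, mb)) in combinations(enumerate(clusters), 2):
            s = sum(record_similarity(augmented[i], augmented[j])
                    for i in ma for j in mb) / (len(ma) * len(mb))
            if best is None or s > best[0]:
                best = (s, x, y)
        _, x, y = best
        merged = ((clusters[x][0], clusters[y][0]), clusters[x][1] + clusters[y][1])
        clusters = [c for i, c in enumerate(clusters) if i not in (x, y)] + [merged]
    return clusters[0][0]

def leaves(tree):
    # Flatten a nested-tuple taxonomy into the set of record indices under it.
    return {tree} if isinstance(tree, int) else leaves(tree[0]) | leaves(tree[1])

# Toy dataset (hypothetical): each record is a list of vocabulary terms.
records = [["apple", "banana"], ["apple", "cherry"],
           ["engine", "wheel"], ["brake", "engine"]]
vocab = sorted({t for r in records for t in r})

nn = nearest_neighbors(term_vectors(records, vocab), k=2)
augmented = augment(records, nn)
tree = build_taxonomy(augmented)
print(augmented[0])
print(tree)
```

In this toy run, the first record never contains "cherry", yet it receives a nonzero imputed membership for that term because "cherry" is a nearest neighbor of "apple". That extra overlap is what lets sparse records that share few literal terms still end up grouped together in the resulting hierarchy.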