Abstract |
A system and method are provided for discovering terminology unique to a distinct subset of a general population. Such terminology, comprising new terms as well as unique and obscure usages of previously known terms, is determined by first creating a common language usage (CLU) dictionary from a collection of documents intended for a general audience; the dictionary comprises terms, definitions corresponding to term usages, and frequencies corresponding to term usage. A group dictionary is prepared in a similar manner for the distinct subset, and the two dictionaries are then compared to determine: the existence of terms not shared in common; differences in usage of terms shared in common; and disparities in frequencies of usages of terms shared in common. The comparison highlights differences between the communications of the general population and those of the distinct subset, and establishes terminology that is unique to that subset.
|
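The comparison described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes each document has already been reduced to (term, usage label) pairs by some upstream sense-tagging step, and the names `build_dictionary`, `compare`, and the `ratio` threshold are hypothetical choices made for illustration only.

```python
from collections import Counter, defaultdict


def build_dictionary(tagged_docs):
    """Map each term to {usage label: relative frequency}.

    Assumption: every document is a list of (term, usage) pairs
    produced by a separate tagging step not shown here.
    """
    counts = defaultdict(Counter)
    total = 0
    for doc in tagged_docs:
        for term, usage in doc:
            counts[term][usage] += 1
            total += 1
    return {
        term: {usage: n / total for usage, n in usages.items()}
        for term, usages in counts.items()
    }


def compare(clu, group, ratio=2.0):
    """Return (new terms, new usages, frequency disparities).

    `ratio` is a hypothetical threshold: a shared usage is reported
    when its relative frequency differs by at least this factor.
    """
    new_terms = sorted(set(group) - set(clu))
    new_usages, freq_gaps = [], []
    for term in sorted(set(group) & set(clu)):
        for usage, g_freq in sorted(group[term].items()):
            c_freq = clu[term].get(usage)
            if c_freq is None:
                new_usages.append((term, usage))
            elif g_freq / c_freq >= ratio or c_freq / g_freq >= ratio:
                freq_gaps.append((term, usage))
    return new_terms, new_usages, freq_gaps


# Toy data, invented for illustration.
general_docs = [[("bug", "insect"), ("bug", "insect"),
                 ("bug", "defect"), ("tree", "plant")]]
group_docs = [[("bug", "defect"), ("bug", "defect"), ("bug", "defect"),
               ("tree", "data structure"), ("refactor", "restructure code")]]

clu_dictionary = build_dictionary(general_docs)
group_dictionary = build_dictionary(group_docs)
new_terms, new_usages, freq_gaps = compare(clu_dictionary, group_dictionary)
```

With this toy data, `new_terms` holds the term absent from the CLU dictionary, `new_usages` holds the shared term used in a sense the CLU dictionary lacks, and `freq_gaps` holds the shared usage whose relative frequency differs by at least the chosen factor. Relative frequencies are used rather than raw counts so that collections of different sizes remain comparable.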