Main claim |
1. A computer program product for identifying word-senses, the computer program product comprising:
a computer-readable storage medium having program code embodied therewith, the program code executable by a processor of a computer to perform a method comprising:
generating, by a computer, a plurality of arrays of aggregated statistical information of words, their corresponding word-senses, and temporal properties within different professional fields using an n-gram viewer, wherein the aggregated statistical information comprises frequency of usage of words, frequency of occurrence of words, frequency of co-occurrence of words with other words, and their respective corresponding word-senses;
generating, by the computer, a set of domain tables based on the generated plurality of arrays of aggregated statistical information, wherein each of the domain tables within the set of domain tables corresponds to a different professional field comprising medical, veterinary, legal, and engineering;
receiving, from a remote server through a network, a digital text stream comprising metadata and one or more words from a doctor, using the computer, the network being an internet connection;
selecting, using the metadata, a medical frequency domain table, veterinary frequency domain table, and a word-sense domain table from the set of domain tables;
determining a frequency of occurrence value for the received digital text stream within each of the selected domain tables;
receiving a threshold from the doctor;
associating the medical frequency domain table with the received digital text stream in response to the frequency of occurrence value satisfying the received threshold;
determining a word-sense of the received digital text stream, by determining a corresponding word sense to the received digital text stream within the medical frequency domain table;
assigning a confidence value to the word-sense based on a degree of frequency of occurrence of the received digital text stream within the medical domain, wherein the word-sense has a higher confidence value, when the frequency of occurrence of the received digital text stream is higher within the medical domain table; and
presenting the word-sense and the confidence value to the doctor. |
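The selection, threshold, word-sense, and confidence steps recited in the claim can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the claim: the `DomainTable` structure, the `fields` metadata key, the use of a mean per-word frequency as the "frequency of occurrence value", and the choice to express confidence as the primary table's share of the total score across the selected tables. The claim only requires that confidence rise with frequency in the medical domain table, so other monotone scoring rules would fit equally well.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class DomainTable:
    """Hypothetical domain table for one professional field."""
    name: str
    frequency: Dict[str, float]  # word -> relative frequency of occurrence in the field
    senses: Dict[str, str]       # word -> word-sense used in the field


def select_tables(metadata: dict, all_tables: Dict[str, DomainTable]) -> Dict[str, DomainTable]:
    """Pick the tables named in the stream metadata (the 'fields' key is an assumption)."""
    return {name: all_tables[name]
            for name in metadata.get("fields", [])
            if name in all_tables}


def stream_frequency(words: List[str], table: DomainTable) -> float:
    """Assumed scoring rule: mean per-word frequency of the stream within one table."""
    if not words:
        return 0.0
    return sum(table.frequency.get(w.lower(), 0.0) for w in words) / len(words)


def identify_word_senses(
    words: List[str],
    tables: Dict[str, DomainTable],
    primary: str,
    threshold: float,
) -> Optional[Tuple[Dict[str, str], float]]:
    """Return (word-senses, confidence) from the primary table, or None if the
    primary table is missing or its frequency value does not satisfy the threshold."""
    if primary not in tables:
        return None
    scores = {name: stream_frequency(words, t) for name, t in tables.items()}
    if scores[primary] < threshold:
        return None
    table = tables[primary]
    senses = {w.lower(): table.senses[w.lower()]
              for w in words if w.lower() in table.senses}
    total = sum(scores.values())
    # Assumed confidence rule: higher primary-table frequency gives higher confidence.
    confidence = scores[primary] / total if total > 0 else 0.0
    return senses, confidence
```

In this sketch a caller would pass `primary="medical"` and the doctor-supplied threshold; a `None` result stands for the case where the medical table is not associated with the text stream.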