Independent claim |
1. A computer implemented method for using recognized phrasal terms to automatically determine meaning of a string of text where that determined meaning differs from a meaning of component words of the string of text in a computer system, the method comprising:
identifying, using a computer processing system, a plurality of phrasal terms from a corpus of written materials;
ranking, using the computer processing system, the plurality of phrasal terms;
selecting, using the computer processing system, each of the plurality of phrasal terms sequentially based on the ranking; and
for each selected phrasal term, determining, using the computer processing system, whether to accept or reject the selected phrasal term based on:
i. rejecting the selected phrasal term if any term within the selected phrasal term is non-alphabetic, unless the term is non-alphabetic as a result of a hyphen or apostrophe;
ii. rejecting the selected phrasal term if the selected phrasal term is a variation of a previously accepted phrasal term, wherein the variation is a plural form, a hyphenated form, or a closed compound form;
iii. replacing a previously accepted phrasal term with the selected phrasal term if the previously accepted phrasal term is a component of the selected phrasal term and the frequency of the previously accepted phrasal term and the frequency of the selected phrasal term are the same; and
iv. rejecting the selected phrasal term if the selected phrasal term is a component of a previously accepted phrasal term and the frequency of the previously accepted phrasal term and the frequency of the selected phrasal term are the same;
using a list of remaining phrasal terms to automatically determine a meaning of a received string of text using the computer processing system, where that determined meaning differs from a meaning of component words of the string of text. |
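The accept/reject pass in steps i to iv can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names (`filter_phrasal_terms`, `variant_key`, `is_component`), the token pattern, the naive trailing-"s" plural test, and the reading of "component" as a contiguous run of words are all assumptions made for illustration. The input is assumed to be already ranked, as (phrase, frequency) pairs, and rule iv is checked before rule iii mutates the accepted list.

```python
import re

# Assumption: a token is acceptable if it is alphabetic, optionally joined
# by single hyphens or apostrophes (e.g. "free-for-all", "cat's").
TOKEN_OK = re.compile(r"^[A-Za-z]+(?:['-][A-Za-z]+)*$")


def variant_key(phrase):
    """Map plural, hyphenated, and closed-compound variants to one key (naive)."""
    joined = re.sub(r"[\s-]+", "", phrase.lower())
    # Assumption: a trailing "s" marks a plural; real morphology is richer.
    if joined.endswith("s") and len(joined) > 3:
        return joined[:-1]
    return joined


def is_component(inner, outer):
    """True if inner's words form a shorter contiguous run inside outer."""
    a, b = inner.lower().split(), outer.lower().split()
    if len(a) >= len(b):
        return False
    return any(b[i:i + len(a)] == a for i in range(len(b) - len(a) + 1))


def filter_phrasal_terms(ranked):
    """Apply rules i-iv to (phrase, frequency) pairs given in rank order."""
    accepted = {}  # phrase -> frequency, kept in acceptance order
    for phrase, freq in ranked:
        # Rule i: reject if any token is non-alphabetic (hyphen/apostrophe allowed).
        if not all(TOKEN_OK.match(tok) for tok in phrase.split()):
            continue
        # Rule ii: reject variants of an already accepted term.
        key = variant_key(phrase)
        if any(variant_key(p) == key for p in accepted):
            continue
        # Rule iv: reject if contained in an accepted term of equal frequency.
        if any(f == freq and is_component(phrase, p) for p, f in accepted.items()):
            continue
        # Rule iii: drop accepted terms that this term contains at equal frequency.
        for p in [p for p, f in accepted.items() if f == freq and is_component(p, phrase)]:
            del accepted[p]
        accepted[phrase] = freq
    return list(accepted)


# Hypothetical ranked input; the frequencies are made up for illustration.
ranked = [
    ("hot dog", 40),
    ("hot dogs", 12),         # plural variant
    ("hot-dog", 9),           # hyphenated variant
    ("hotdog", 7),            # closed compound variant
    ("catch 22", 30),         # contains a non-alphabetic token
    ("kick the", 25),
    ("kick the bucket", 25),  # contains "kick the" at the same frequency
    ("the bucket", 25),       # contained in "kick the bucket" at the same frequency
    ("rule of thumb", 20),
]
remaining = filter_phrasal_terms(ranked)
print(remaining)  # ['hot dog', 'kick the bucket', 'rule of thumb']
```

The surviving list is what the final step would consult when a received string contains one of these phrases, so that "kick the bucket" is interpreted as a unit and not word by word.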