摘要 |
A method and system for determining a lexical association of phrasal terms are described. A corpus having a plurality of words is received, and a plurality of contexts including one or more context words proximate to a word in the corpus is determined. An occurrence count for each context is determined, and a global rank is assigned based on the occurrence count. Similarly, a number of occurrences of a word being used in a context is determined, and a local rank is assigned to the word-context pair based on the number of occurrences. A rank ratio is then determined for each word-context pair. A rank ratio is equal to the global rank divided by the local rank for a word-context pair. A mutual rank ratio is determined by multiplying the rank ratios corresponding to a phrase. The mutual rank ratio is used to identify phrasal terms in the corpus. |