Title of invention |
Cooccurrence and constructions |
Abstract |
A method and system for performing automatic text analysis are described. A local ranking for one or more contexts with respect to a word and a global ranking for one or more contexts are generated. The rankings are based on the frequency with which the contexts appear in a corpus. A statistic may be generated using the local and global rankings, such as a log ratio rank statistic equal to the logarithm of the global rank divided by the local rank, to measure the similarity of contexts with respect to words with which they combine. A source matrix of word to context values is then created. Singular value decomposition is used to create sub-matrices from the source matrix. Vectors from the sub-matrices corresponding to context(s) and/or word(s) are then selected to determine term-term or context-context similarity or term-context correspondence.
|
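The pipeline summarized in the abstract (frequency-based rankings, a log ratio rank statistic, a word-by-context source matrix, and singular value decomposition) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the patent: the toy `pairs` corpus, the helper names (`rank_contexts`, `log_ratio_rank`, `build_source_matrix`, `svd_vectors`, `cosine`), the alphabetical tie-breaking rule, the use of 0 for unobserved word-context pairs, and the choice of cosine similarity to compare vectors.

```python
import math
from collections import Counter

import numpy as np


def rank_contexts(counts):
    """Map each context to a 1-based rank, most frequent first.

    Ties are broken alphabetically (an assumption of this sketch).
    """
    ordered = sorted(counts, key=lambda c: (-counts[c], c))
    return {c: i + 1 for i, c in enumerate(ordered)}


def log_ratio_rank(global_rank, local_rank):
    """Logarithm of the global rank divided by the local rank.

    Positive when the context ranks higher for this word than in the corpus overall.
    """
    return math.log(global_rank / local_rank)


def build_source_matrix(pairs):
    """Build a word-by-context matrix of log ratio rank values.

    Unobserved word-context pairs are left at 0 (an assumption of this sketch).
    """
    words = sorted({w for w, _ in pairs})
    contexts = sorted({c for _, c in pairs})
    global_rank = rank_contexts(Counter(c for _, c in pairs))
    matrix = np.zeros((len(words), len(contexts)))
    for i, word in enumerate(words):
        local_rank = rank_contexts(Counter(c for w, c in pairs if w == word))
        for context, rank in local_rank.items():
            matrix[i, contexts.index(context)] = log_ratio_rank(global_rank[context], rank)
    return words, contexts, matrix


def svd_vectors(matrix, k):
    """Return k-dimensional word vectors (U * s) and context vectors (V)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k], vt[:k].T


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


# Hypothetical (word, context) observations standing in for a corpus.
pairs = [
    ("dog", "the _ barked"), ("dog", "the _ barked"), ("dog", "feed the _"),
    ("cat", "feed the _"), ("cat", "the _ purred"),
    ("car", "drive the _"), ("car", "drive the _"), ("car", "park the _"),
]

words, contexts, M = build_source_matrix(pairs)
word_vecs, ctx_vecs = svd_vectors(M, k=2)

# Term-term similarity in the reduced space.
dog, cat = words.index("dog"), words.index("cat")
print("dog vs cat:", cosine(word_vecs[dog], word_vecs[cat]))
```

In this sketch, rows of `word_vecs` can be compared with each other for term-term similarity, and rows of `ctx_vecs` for context-context similarity. Comparing a word row with a context row gives one possible reading of term-context correspondence.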
Publication number |
US7373102(B2) |
Publication date |
2008.05.13 |
Application number |
US20040915881 |
Filing date |
2004.08.11 |
Applicant |
EDUCATIONAL TESTING SERVICE |
Inventor |
DEANE PAUL |
Classification numbers |
G06F17/27;G06F;G06F17/21;G09B11/00;G10L15/06;G10L15/18 |
Main classification number |
G06F17/27 |
Agency |
|
Agent |
|
Main claim |
|
Address |
|