Title: Computer-implemented systems and methods for non-monotonic recognition of phrasal terms
Abstract: Systems and methods are provided for non-monotonic recognition of phrasal terms. Phrasal terms are identified from a corpus of written materials and ranked based on, for example, a mutual rank ratio. The phrasal terms are selected sequentially, and each selected phrasal term is accepted or rejected based on at least one predetermined criterion. The ranking of the phrasal terms may also rely on linguistic support to reduce duplication of phrasal terms and to distinguish different confidence levels for identified and accepted phrasal terms.
Publication number: US9208145(B2) Publication date: 2015.12.08
Application number: US201313827174 Filing date: 2013.03.14
Applicant: Educational Testing Service Inventors: Krovetz Robert; Deane Paul
Classification: G06F17/21; G06F17/27 Main classification: G06F17/21
Agency: Jones Day Agent: Jones Day
Main claim: 1. A computer implemented method for using recognized phrasal terms to automatically determine meaning of a string of text where that determined meaning differs from a meaning of component words of the string of text in a computer system, the method comprising: identifying, using a computer processing system, a plurality of phrasal terms from a corpus of written materials; ranking, using the computer processing system, the plurality of phrasal terms; selecting, using the computer processing system, each of the plurality of phrasal terms sequentially based on the ranking; and for each selected phrasal term, determining, using the computer processing system, whether to accept or reject the selected phrasal term based on: i. rejecting the selected phrasal term if any term within the selected phrasal term is non-alphabetic, unless the term is non-alphabetic as a result of a hyphen or apostrophe; ii. rejecting the selected phrasal term if the selected phrasal term is a variation of a previously accepted phrasal term, wherein the variation is a plural form, a hyphenated form, or a closed compound form; iii. replacing a previously accepted phrasal term with the selected phrasal term if the previously accepted phrasal term is a component of the selected phrasal term and the frequency of the previously accepted phrasal term and the frequency of the selected phrasal term are the same; and iv. rejecting the selected phrasal term if the selected phrasal term is a component of a previously accepted phrasal term and the frequency of the previously accepted phrasal term and the frequency of the selected phrasal term are the same; using a list of remaining phrasal terms to automatically determine a meaning of a received string of text using the computer processing system, where that determined meaning differs from a meaning of component words of the string of text.
Address: Princeton NJ US
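The accept/reject pass described in criteria i to iv of the main claim can be illustrated with a small sketch. This is not the patented implementation: the function names (`filter_terms`, `normalize`, `is_component`), the input shape (a list of `(term, frequency)` pairs already in ranked order), and the crude variant detection (lowercasing, removing spaces and hyphens, stripping one trailing "s") are all assumptions made for illustration. The ranking step itself (for example by mutual rank ratio) is taken as given and not computed here.

```python
from typing import Dict, List, Tuple


def token_ok(token: str) -> bool:
    """Criterion i: alphabetic once hyphens and apostrophes are ignored."""
    return token.replace("-", "").replace("'", "").isalpha()


def normalize(term: str) -> str:
    """Hypothetical variant key: folds plural, hyphenated and closed forms."""
    key = term.lower().replace("-", "").replace(" ", "")
    return key[:-1] if key.endswith("s") and len(key) > 3 else key


def is_component(shorter: str, longer: str) -> bool:
    """True if shorter's words occur contiguously inside the longer term."""
    s, l = shorter.lower().split(), longer.lower().split()
    if len(s) >= len(l):
        return False
    return any(l[i:i + len(s)] == s for i in range(len(l) - len(s) + 1))


def filter_terms(ranked: List[Tuple[str, int]]) -> List[str]:
    """Walk the ranked candidates once, applying criteria i to iv in order."""
    accepted: Dict[str, int] = {}  # term -> frequency, in acceptance order
    for term, freq in ranked:
        # i. reject terms containing a non-alphabetic word
        if not all(token_ok(t) for t in term.split()):
            continue
        # ii. reject variants of an already accepted term
        if any(normalize(term) == normalize(a) for a in accepted):
            continue
        # iv. reject a component of an accepted term with equal frequency
        if any(is_component(term, a) and f == freq for a, f in accepted.items()):
            continue
        # iii. the non-monotonic step: drop accepted components with equal
        # frequency, so the longer term replaces them
        for a in [a for a, f in accepted.items()
                  if is_component(a, term) and f == freq]:
            del accepted[a]
        accepted[term] = freq
    return list(accepted)


if __name__ == "__main__":
    # Made-up candidates and frequencies, listed in an assumed rank order.
    candidates = [
        ("hot dog", 50), ("york city", 30), ("hot dogs", 20),
        ("new york city", 30), ("top 10", 12), ("new york", 30),
        ("hot-dog", 5),
    ]
    print(filter_terms(candidates))  # ['hot dog', 'new york city']
```

In this toy run, "york city" is accepted first and later withdrawn when "new york city" arrives with the same frequency, which is the sense in which the procedure is non-monotonic: an earlier acceptance can be revised by a later candidate.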