Title: Automatic context sensitive language correction and enhancement using an internet corpus
Abstract: A computer-assisted language correction system including spelling correction functionality, misused word correction functionality, grammar correction functionality and vocabulary enhancement functionality utilizing contextual feature-sequence functionality employing an internet corpus.
Publication No.: US8914278(B2) Publication Date: 2014.12.16
Application No.: US200812669175 Filing Date: 2008.07.31
Applicant: Ginger Software, Inc. Inventors: Zangvil Yael Karov; Zangvil Avner
Classification: G06F17/27; G06F17/21 Main Classification: G06F17/27
Agency: Edwards Wildman Palmer LLP Agents: Edwards Wildman Palmer LLP; Kramer Barry; Jones Joshua L.
Main Claim: 1. A computer-assisted language correction system comprising:
  a computer storage device, storing computer modules;
  a computer processor operative to execute said modules;
  said computer modules including:
    contextual feature-sequence (CFS) functionality operative to generate a plurality of contextual feature-sequences based on an input sentence, said contextual feature sequence comprising at least one of N-grams, skip-grams, switch-grams, co-occurrences, and combinations thereof;
    an alternatives generator, generating on the basis of said input sentence a text-based representation providing multiple alternatives for each of a plurality of words in the sentence, said multiple alternatives including non-contextual corrections for each of said plurality of words;
    a selector for selecting among at least said multiple alternatives for each of said plurality of words in the sentence, said selector including context based scoring functionality operative to rank said multiple alternatives, based at least partly on contextual feature-sequence frequencies of occurrences in an internet corpus for each of the plurality of contextual feature-sequences, said context based scoring functionality including ranking said multiple alternatives based at least partially on a CFS importance score, wherein the CFS importance score is a function of a combination of:
      a) a number of parsing tree nodes that correspond to a same part of the CFS, and
      b) a frequency of occurrence of each of the words in the CFS; and
    a correction generator operative to provide a correction output based on selections made by said selector.
Address: Lexington, MA, US
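
Illustrative note on claim 1: the following is a minimal sketch of how contextual feature-sequences might be generated around one word and used to rank alternatives by corpus frequency. It is not the patented implementation. All names (`ngrams_containing`, `skip_grams_containing`, `feature_sequences`, `rank_alternatives`, `TOY_COUNTS`) are hypothetical, the tiny hand-made count table stands in for internet corpus frequencies, only N-grams and one-gap skip-grams are covered, and the CFS importance score recited in the claim is omitted in favor of a plain sum of counts.

```python
from collections import Counter


def ngrams_containing(tokens, pos, n):
    """Return all contiguous n-grams of `tokens` that include index `pos`."""
    first = max(0, pos - n + 1)
    last = min(pos, len(tokens) - n)
    return [tuple(tokens[s:s + n]) for s in range(first, last + 1)]


def skip_grams_containing(tokens, pos, gap=1):
    """Return two-word skip-grams that include index `pos`.

    "_" marks the skipped slot; `gap` is the number of skipped words.
    """
    step = gap + 1
    out = []
    if pos - step >= 0:
        out.append((tokens[pos - step], "_", tokens[pos]))
    if pos + step < len(tokens):
        out.append((tokens[pos], "_", tokens[pos + step]))
    return out


def feature_sequences(tokens, pos):
    """Collect bigrams, trigrams and skip-grams around the word at `pos`."""
    feats = []
    for n in (2, 3):
        feats.extend(ngrams_containing(tokens, pos, n))
    feats.extend(skip_grams_containing(tokens, pos))
    return feats


def rank_alternatives(tokens, pos, alternatives, corpus_counts):
    """Rank alternatives for `tokens[pos]` by summed feature-sequence counts.

    Assumption: a plain sum replaces the claim's importance-weighted scoring.
    """
    scored = []
    for alt in alternatives:
        candidate = tokens[:pos] + [alt] + tokens[pos + 1:]
        score = sum(corpus_counts[f] for f in feature_sequences(candidate, pos))
        scored.append((alt, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical counts standing in for internet corpus frequencies.
TOY_COUNTS = Counter({
    ("a", "piece"): 50,
    ("piece", "of"): 60,
    ("a", "piece", "of"): 40,
    ("piece", "of", "cake"): 30,
    ("piece", "_", "cake"): 3,
    ("a", "peace"): 5,
    ("peace", "of"): 8,
})

sentence = "I ate a peace of cake".split()
ranking = rank_alternatives(sentence, 3, ["peace", "piece"], TOY_COUNTS)
print(ranking)
```

With these toy counts, "piece" ranks above "peace" because more of its surrounding feature-sequences appear in the count table. A real system would query frequencies from a large corpus and weight each feature-sequence rather than summing raw counts.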