Title of invention: Systems and methods for improving feature ranking using phrasal compensation and acronym detection
Abstract: Systems and methods are disclosed for analyzing a set of documents by building a positive set histogram; selecting phrases from the positive set histogram; modifying the frequency statistics in the histogram using the selected phrases; identifying one or more potential phrase-acronym pairs; selecting a subset of phrase-acronym pairs from the potential pairs; adding a new feature for each selected phrase-acronym (phrase ∥ acronym) pair to the positive set histogram; determining a value for each new feature; identifying one or more child concepts based on the updated histogram; grouping the one or more child concepts; and determining a child concept group coverage for one or more documents.
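The first steps named in the abstract (histogram building, phrasal compensation, and phrase-acronym merging) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: the function names `ngrams` and `analyze`, the `min_df` threshold for selecting phrases, the rule of subtracting a phrase's document count from its constituent words, the first-letter acronym test, and the choice of "documents containing either form" as the value of the merged feature. It is not the method claimed in the patent.

```python
from collections import Counter


def ngrams(text, max_n=2):
    """Return the set of 1..max_n word n-grams in a document (lowercased)."""
    tokens = text.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def analyze(docs, min_df=2, max_n=2):
    """Hypothetical pipeline: histogram -> phrasal compensation -> acronym merge."""
    doc_feats = [ngrams(d, max_n) for d in docs]

    # Positive set histogram: number of documents containing each feature.
    hist = Counter(f for feats in doc_feats for f in feats)

    # Assumed phrase selection: multi-word features seen in >= min_df documents.
    phrases = sorted(f for f, c in hist.items() if " " in f and c >= min_df)

    # Assumed phrasal compensation: discount each constituent word by the
    # document count of the phrase it appears in, clamped at zero.
    adjusted = Counter(hist)
    for p in phrases:
        for w in set(p.split()):
            adjusted[w] = max(0, adjusted[w] - hist[p])

    # Assumed acronym detection: a token equal to the phrase's initials.
    for p in phrases:
        acronym = "".join(w[0] for w in p.split())
        if acronym in hist:
            # New merged feature; its value is the number of documents
            # containing the phrase or the acronym (an assumption).
            adjusted[f"{p} ∥ {acronym}"] = sum(
                1 for feats in doc_feats if p in feats or acronym in feats
            )
    return adjusted


if __name__ == "__main__":
    sample = [
        "information retrieval systems rank documents",
        "ir systems rank documents",
        "information retrieval and ranking",
    ]
    for feature, count in sorted(analyze(sample).items()):
        print(feature, count)
```

In this toy input, the merged feature `information retrieval ∥ ir` covers all three documents, whereas the phrase alone covers two and the acronym alone covers one. That is the kind of ranking improvement the title refers to, under the assumptions stated above.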
Publication number: US2005114130(A1) Publication date: 2005.05.26
Application number: US20040888419 Filing date: 2004.07.09
Applicant: NEC LABORATORIES AMERICA, INC. Inventors: JAVA AKSHAY;KLOCK BRIAN;GLOVER ERIC J.;SHANBHAG VISHAL;KROVETZ ROBERT
Classification: G06F17/30;G10L15/12;(IPC1-7):G10L15/12 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address: