Title of invention: Automatic generation of stop word lists for information retrieval and analysis
Abstract: Methods and systems for automatically generating lists of stop words for information retrieval and analysis. Generation of the stop words can include providing a corpus of documents and a plurality of keywords. From the corpus of documents, a term list of all terms is constructed and both a keyword adjacency frequency and a keyword frequency are determined. If a ratio of the keyword adjacency frequency to the keyword frequency for a particular term on the term list is less than a predetermined value, then that term is excluded from the term list. The resulting term list is truncated based on predetermined criteria to form a stop word list.
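The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made for illustration only: the regex tokenizer, a single keyword list shared by all documents, "adjacent" meaning the token immediately before or after a keyword occurrence, "keyword frequency" meaning the count of a term's occurrences inside keyword occurrences, dropping terms never seen next to a keyword (the ratio is undefined when both counts are zero), and truncation to the `max_size` most frequent surviving terms. The names `generate_stop_words`, `min_ratio`, and `max_size` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (an assumption; the abstract does not specify one)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_spans(tokens, keyword_token_lists):
    """Return (start, end) index pairs for every keyword occurrence in tokens."""
    spans = []
    for kw in keyword_token_lists:
        n = len(kw)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == kw:
                spans.append((i, i + n))
    return spans


def generate_stop_words(documents, keywords, min_ratio=1.0, max_size=100):
    """Sketch of the abstract's procedure; min_ratio and max_size are hypothetical."""
    kw_tokens = [tokenize(k) for k in keywords]
    term_freq = Counter()       # occurrences of each term in the corpus
    adjacency_freq = Counter()  # occurrences directly next to a keyword
    keyword_freq = Counter()    # occurrences inside a keyword

    for doc in documents:
        tokens = tokenize(doc)
        term_freq.update(tokens)
        spans = keyword_spans(tokens, kw_tokens)
        inside = {j for start, end in spans for j in range(start, end)}
        adjacent = set()
        for start, end in spans:
            for j in (start - 1, end):
                if 0 <= j < len(tokens) and j not in inside:
                    adjacent.add(j)
        for j in inside:
            keyword_freq[tokens[j]] += 1
        for j in adjacent:
            adjacency_freq[tokens[j]] += 1

    kept = []
    for term in term_freq:
        adj, kw = adjacency_freq[term], keyword_freq[term]
        # Exclude when adj / kw < min_ratio; written as a product to avoid
        # dividing by zero. Terms with adj == 0 are also dropped (assumption).
        if adj == 0 or adj < min_ratio * kw:
            continue
        kept.append(term)

    # Truncation criterion (assumption): keep the most frequent terms,
    # breaking ties alphabetically so the output is deterministic.
    kept.sort(key=lambda t: (-term_freq[t], t))
    return kept[:max_size]


docs = [
    "the analysis of stop words in the corpus",
    "a list of stop words for the retrieval system",
]
print(generate_stop_words(docs, ["stop words", "retrieval system", "corpus"], max_size=3))
# ['the', 'of', 'for']
```

In this toy corpus, "the", "of", "for", and "in" occur next to keywords but never inside them, so they survive the ratio test; "stop" and "words" occur only inside keywords and are excluded.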
Publication number: US8352469(B2)   Publication date: 2013.01.08
Application number: US20090555962   Application date: 2009.09.09
Applicant: BATTELLE MEMORIAL INSTITUTE; ROSE STUART J   Inventor: ROSE STUART J
Classification: G06F7/00; G06F17/30   Main classification: G06F7/00
Agency:   Agent:
Principal claim:
Address: