Title of invention: Lemmatizing, stemming, and query expansion method and system
Abstract: A method of stemming text, and a system therefor, are described. The method comprises: removing stop words from a document based on at least one stop word entry in an array of stop words; flagging as nouns words determined to be attached to definite articles and preceded by a noun array entry in an array of stop words preceding at least one noun; adding flagged nouns to a noun dictionary; flagging as verbs words determined to be preceded by a verb array entry in an array of stop words preceding at least one verb; adding flagged verbs to a verb dictionary; searching the document for nouns and verbs based on the flagged nouns and the flagged verbs; removing remaining stop words subsequent to searching the document; applying light stemming to the flagged nouns; applying root-based stemming to the flagged verbs; and storing the stemmed document.
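The sequence of steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: every word list, the prefix-style definite article marker `al-`, and the suffix-stripping rules are hypothetical placeholders chosen so the example runs on toy tokens. The sketch also reads the two noun cues (attachment to a definite article, or a preceding noun-array stop word) as alternatives, which is one interpretation of the abstract, and it stands in simple suffix stripping for both light and root-based stemming. Storing the result is left to the caller.

```python
# Hypothetical word lists and affixes; none of these values come from the patent.
INITIAL_STOP_WORDS = {"and", "or"}   # removed in the first pass
NOUN_PRECEDING = {"in", "of"}        # stop words assumed to precede a noun
VERB_PRECEDING = {"will", "to"}      # stop words assumed to precede a verb
DEFINITE_ARTICLE = "al-"             # assumed prefix-style definite article
LIGHT_SUFFIXES = ("s",)              # stand-in for light (affix) stemming
ROOT_SUFFIXES = ("ing", "ed", "s")   # stand-in for root-based stemming


def strip_suffix(word, suffixes):
    """Remove the first matching suffix, keeping at least three characters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_document(text):
    """Return (stemmed tokens, noun dictionary, verb dictionary)."""
    # Step 1: remove the initial stop words.
    tokens = [t for t in text.lower().split() if t not in INITIAL_STOP_WORDS]

    # Step 2: flag nouns and verbs from their context and fill the dictionaries.
    noun_dict, verb_dict = set(), set()
    for prev, word in zip([None] + tokens, tokens):
        if word in NOUN_PRECEDING or word in VERB_PRECEDING:
            continue  # cue words themselves are never flagged
        if word.startswith(DEFINITE_ARTICLE):
            noun_dict.add(word[len(DEFINITE_ARTICLE):])
        elif prev in NOUN_PRECEDING:
            noun_dict.add(word)
        elif prev in VERB_PRECEDING:
            verb_dict.add(word)

    # Steps 3-5: look every token up in the dictionaries, drop the remaining
    # stop words, then stem nouns lightly and verbs more aggressively.
    remaining_stop_words = NOUN_PRECEDING | VERB_PRECEDING
    stemmed = []
    for word in tokens:
        if word in remaining_stop_words:
            continue
        if word.startswith(DEFINITE_ARTICLE):
            word = word[len(DEFINITE_ARTICLE):]
        if word in noun_dict:
            stemmed.append(strip_suffix(word, LIGHT_SUFFIXES))
        elif word in verb_dict:
            stemmed.append(strip_suffix(word, ROOT_SUFFIXES))
        else:
            stemmed.append(word)
    return stemmed, noun_dict, verb_dict
```

For example, `stem_document("al-books and will walking in gardens walking")` flags `books` and `gardens` as nouns and `walking` as a verb, and the second, unmarked occurrence of `walking` is still stemmed because the dictionary lookup finds it. Keeping the cue stop words until after flagging is the point of the two-pass removal: they carry the part-of-speech signal that decides which stemmer is applied.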
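Publication number: US8473279(B2)  Publication date: 2013.06.25
Application number: US20090476238  Filing date: 2009.06.01
Applicant: AL-SHAMMARI EIMAN TAMAH  Inventor: AL-SHAMMARI EIMAN TAMAH
Classification: G06F17/20;G06F17/21;G06F17/27;G06F17/28  Main classification: G06F17/20
Agency:  Agent:
Main claim:
Address: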