Title of invention: DISTRIBUTED METHOD FOR INTEGRATING DATA MINING AND TEXT CATEGORIZATION TECHNIQUES
Abstract: A method for prediction analysis using text categorization is provided. The method includes the steps of: grouping a plurality of text documents into a plurality of classes; selecting the top m most discriminatory terms for each class of documents using statistics-based measures; determining, for each document, the presence or absence of each of the discriminatory terms; learning rule-based models of each class of documents using a rule learning algorithm; determining, for at least a portion of the plurality of documents, whether a given learned rule has been satisfied by each respective document; creating a database of the rules associated with the documents satisfying them; and performing distributed data mining to form a predictive result based on at least a portion of the plurality of documents.
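The non-distributed steps of the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names (tokenize, top_m_terms, learn_rules, rule_database), the toy documents, the use of a document-frequency difference as the statistics-based measure, and the single-term rule learner with a precision threshold that stands in for the unspecified rule learning algorithm. The distributed data mining step is not covered.

```python
from collections import defaultdict

def tokenize(text):
    # Presence/absence only: a document is reduced to its set of lowercase terms.
    return set(text.lower().split())

def top_m_terms(docs, labels, cls, m):
    # Assumed measure: in-class document frequency minus out-of-class frequency.
    in_docs = [tokenize(d) for d, y in zip(docs, labels) if y == cls]
    out_docs = [tokenize(d) for d, y in zip(docs, labels) if y != cls]
    vocab = set().union(*in_docs)

    def score(term):
        p_in = sum(term in d for d in in_docs) / len(in_docs)
        p_out = sum(term in d for d in out_docs) / len(out_docs) if out_docs else 0.0
        return p_in - p_out

    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(vocab, key=lambda t: (-score(t), t))[:m]

def learn_rules(docs, labels, terms_by_class, min_precision=0.8):
    # Hypothetical stand-in learner: keep the rule "term present => cls"
    # when the term's precision on the labelled documents is high enough.
    token_sets = [tokenize(d) for d in docs]
    rules = []
    for cls, terms in terms_by_class.items():
        for term in terms:
            covered = [y for s, y in zip(token_sets, labels) if term in s]
            if covered and covered.count(cls) / len(covered) >= min_precision:
                rules.append((term, cls))
    return rules

def rule_database(docs, rules):
    # Map each rule to the indices of the documents that satisfy it.
    db = defaultdict(list)
    for i, d in enumerate(docs):
        terms = tokenize(d)
        for term, cls in rules:
            if term in terms:
                db[(term, cls)].append(i)
    return dict(db)

# Toy corpus, already grouped into two classes (illustrative data only).
docs = [
    "engine failure reported",
    "engine overheating failure",
    "invoice payment overdue",
    "payment received invoice",
]
labels = ["fault", "fault", "billing", "billing"]

terms_by_class = {c: top_m_terms(docs, labels, c, m=2) for c in sorted(set(labels))}
rules = learn_rules(docs, labels, terms_by_class)
db = rule_database(docs, rules)
print(terms_by_class)
print(db)
```

In this sketch the resulting rule-to-document table plays the role of the database described in the abstract; a distributed mining stage would consume such tables from several sites, which is outside the scope of this example.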
Publication number: WO2008042264(A3)  Publication date: 2008.07.24
Application number: WO2007US20938  Filing date: 2007.09.28
Applicant: INFERX CORPORATION; HADJARIAN, ALI  Inventor: HADJARIAN, ALI
Classification: G06E1/00  Main classification: G06E1/00
Agency:  Agent:
Main claim:
Address: