Title of invention: CATEGORY BASED, EXTENSIBLE AND INTERACTIVE SYSTEM FOR DOCUMENT RETRIEVAL
Abstract: An integrated, automatic and open information retrieval system (100) uses a hybrid method, combining linguistic and mathematical approaches, for automatic text categorization. It addresses the problems of conventional systems by combining an automatic content recognition technique with a self-learning hierarchical scheme of indexed categories. In response to a word submitted by a requestor, the system (100) retrieves documents containing that word, analyzes the documents to determine their word-pair patterns, matches the document patterns to database patterns that are related to topics, and thereby assigns topics to each document. If the retrieved documents are assigned to more than one topic, a list of the document topics is presented to the requestor, and the requestor designates the relevant topics. The requestor is then granted access only to documents assigned to the relevant topics. A knowledge database (1408) linking search terms to documents and documents to topics is established and maintained to speed up future searches. Additionally, new strategies are presented for handling Web sites whose content changes at different update frequencies.
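The retrieval flow described in the abstract (find documents containing the query word, extract word-pair patterns, match them against topic patterns, then filter by the topics the requestor selects) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `TOPIC_PATTERNS` table, the sample `DOCUMENTS`, the sentence-level co-occurrence definition of a "word pair", and all function names are hypothetical and are not taken from the patent.

```python
import re
from itertools import combinations

# Hypothetical topic patterns: alphabetically ordered word pairs per topic.
TOPIC_PATTERNS = {
    "finance": {("bank", "loan"), ("interest", "rate")},
    "geography": {("bank", "river"), ("river", "water")},
}

# Hypothetical document collection, keyed by document id.
DOCUMENTS = {
    "d1": "The bank approved the loan. The interest rate is low.",
    "d2": "We walked along the river bank. The river water was cold.",
    "d3": "The loan has a fixed interest rate.",
}


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def word_pairs(text):
    """Collect unordered word pairs that co-occur within a sentence.

    Assumption: a "word pair" is any two distinct words in the same sentence.
    """
    pairs = set()
    for sentence in re.split(r"[.!?]", text):
        words = sorted(set(tokenize(sentence)))
        pairs.update(combinations(words, 2))
    return pairs


def assign_topics(text, patterns):
    """Assign every topic that shares at least one word pair with the text."""
    doc_pairs = word_pairs(text)
    return {topic for topic, topic_pairs in patterns.items()
            if doc_pairs & topic_pairs}


def search(word, documents, patterns):
    """Retrieve documents containing `word` and the topics assigned to them."""
    word = word.lower()
    topics_by_doc = {
        doc_id: assign_topics(text, patterns)
        for doc_id, text in documents.items()
        if word in tokenize(text)
    }
    all_topics = set().union(*topics_by_doc.values())
    return topics_by_doc, all_topics


def filter_by_topics(topics_by_doc, relevant):
    """Keep only documents assigned to at least one relevant topic."""
    relevant = set(relevant)
    return sorted(doc_id for doc_id, topics in topics_by_doc.items()
                  if topics & relevant)


topics_by_doc, all_topics = search("bank", DOCUMENTS, TOPIC_PATTERNS)
# More than one topic was found, so the requestor would be asked to choose.
selected_docs = filter_by_topics(topics_by_doc, ["finance"])
```

In this sketch the query "bank" matches two documents that fall under different topics, so the topic list would be shown to the requestor; choosing "finance" leaves only the document about loans. A production system would presumably also cache the term-to-document and document-to-topic links, as the abstract describes for the knowledge database (1408), which this sketch omits.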
Publication number: WO2003005235(A1) Publication date: 2003.01.16
Application number: EP2001007649 Filing date: 2001.07.04
Applicant: Inventor:
Classification number: Main classification number:
Agency: Agent:
Main claim:
Address: