Title of invention: CATEGORY BASED, EXTENSIBLE AND INTERACTIVE SYSTEM FOR DOCUMENT RETRIEVAL
Abstract: An integrated, automatic and open information retrieval system (100) comprises a hybrid method based on linguistic and mathematical approaches for automatic text categorization. It solves the problems of conventional systems by combining an automatic content recognition technique with a self-learning hierarchical scheme of indexed categories. In response to a word submitted by a requestor, said system (100) retrieves documents containing that word, analyzes the documents to determine their word-pair patterns, matches the document patterns to database patterns that are related to topics, and thereby assigns topics to each document. If the retrieved documents are assigned to more than one topic, a list of the document topics is presented to the requestor, and the requestor designates the relevant topics. The requestor is then granted access only to documents assigned to the relevant topics. A knowledge database (1408) linking search terms to documents and documents to topics is established and maintained to speed up future searches. Additionally, new strategies are presented to deal with the differing update frequencies of changed Web sites.
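The retrieval flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the topic patterns, the sample documents, the sentence-level definition of a word pair, and all function names (`tokenize`, `word_pairs`, `assign_topics`, `search`, `grant_access`) are assumptions made purely for illustration, and the knowledge database (1408) and the self-learning category hierarchy are left out.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical database patterns: each topic is described by a set of
# alphabetically ordered word pairs. These values are invented examples.
TOPIC_PATTERNS = {
    "finance": {("bank", "loan"), ("bank", "interest")},
    "geography": {("bank", "river"), ("river", "water")},
}

# Hypothetical document collection keyed by document id.
DOCS = {
    "d1": "The bank raised the interest rate on every loan.",
    "d2": "We walked along the river bank. The water was cold.",
    "d3": "No mention here.",
}


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def word_pairs(text):
    """Collect unordered pairs of distinct words that share a sentence.

    Assumption: a 'word-pair pattern' is modelled as sentence-level
    co-occurrence; the source does not define it in this record.
    """
    pairs = set()
    for sentence in re.split(r"[.!?]", text):
        words = sorted(set(tokenize(sentence)))
        pairs.update(combinations(words, 2))
    return pairs


def assign_topics(text, min_matches=1):
    """Assign every topic whose patterns overlap the document's pairs."""
    pairs = word_pairs(text)
    return {
        topic
        for topic, patterns in TOPIC_PATTERNS.items()
        if len(pairs & patterns) >= min_matches
    }


def search(term, documents):
    """Retrieve documents containing the term and group them by topic."""
    by_topic = defaultdict(list)
    for doc_id, text in documents.items():
        if term.lower() in tokenize(text):
            for topic in assign_topics(text):
                by_topic[topic].append(doc_id)
    return dict(by_topic)


def grant_access(by_topic, relevant_topics):
    """Return only the documents assigned to the topics the requestor chose."""
    return sorted({d for t in relevant_topics for d in by_topic.get(t, [])})
```

In this sketch, `search("bank", DOCS)` groups the matching documents under two topics, so the topic list would be shown to the requestor; calling `grant_access` with the chosen topics then restricts the result to documents assigned to those topics.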
Publication number: WO03005235(A1) Publication date: 2003.01.16
Application number: WO2001EP07649 Filing date: 2001.07.04
Applicant: COGISUM INTERMEDIA AG;MEIK, FRANK;WIELSCH, MICHAEL Inventor: MEIK, FRANK;WIELSCH, MICHAEL
Classification: G06F17/30;H04L12/56;(IPC1-7):G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address: