Title of invention METHOD OF DETECTING INFORMATION ON DIFFERENCE OF TEXT DATABASE CONTENT
Abstract PROBLEM TO BE SOLVED: To provide a method of detecting difference information on text database contents that selectively extracts highly novel contents (groups of documents) while using only the ordinary keyword-based inquiry interface provided by a text database. SOLUTION: A word to serve as a keyword is selected and an inquiry is made to the text database to obtain n1 sample documents, each containing the keyword. A classifier is then created which, taking the n1 sample documents as normal examples, determines how closely a given document resembles the normal examples and whether it belongs to the same class as the group of documents given as normal examples. Thereafter, a word to serve as a keyword is selected again and another inquiry is made to the text database. The candidate documents obtained are passed to the classifier, and only those determined not to belong to the same class as the normal examples are kept as extracted documents. This operation is repeated until the number of extracted documents reaches a predetermined number n2. COPYRIGHT: (C)2005,JPO&NCIPI
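The procedure in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the classifier is built or how keywords are chosen, so the sketch assumes a bag-of-words cosine similarity against the summed term counts of the samples, a fixed similarity threshold, and caller-supplied keyword lists. The names `query`, `build_classifier`, `extract_novel`, `DEMO_DB` and the `threshold` parameter are all hypothetical, and the in-memory list stands in for a real text database.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization on lowercased text.
    return text.lower().split()


def query(database, keyword):
    # Stand-in for the keyword inquiry interface: documents containing the keyword.
    return [doc for doc in database if keyword in tokenize(doc)]


def cosine(a, b):
    # Cosine similarity between two term-count vectors (0.0 if either is empty).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_classifier(samples, threshold):
    # Assumed classifier: a document is "same class" when its cosine similarity
    # to the summed term counts of the normal examples reaches the threshold.
    profile = Counter()
    for doc in samples:
        profile.update(tokenize(doc))

    def same_class(doc):
        return cosine(Counter(tokenize(doc)), profile) >= threshold

    return same_class


def extract_novel(database, sample_keywords, probe_keywords, n1, n2, threshold=0.3):
    # Step 1: query by keyword until up to n1 sample documents are collected.
    samples = []
    for kw in sample_keywords:
        for doc in query(database, kw):
            if doc not in samples and len(samples) < n1:
                samples.append(doc)

    # Step 2: build the classifier from the samples (the normal examples).
    same_class = build_classifier(samples, threshold)

    # Step 3: query again and keep only candidates judged NOT to be in the
    # same class, stopping once n2 documents have been extracted.
    extracted = []
    for kw in probe_keywords:
        for doc in query(database, kw):
            if len(extracted) >= n2:
                return extracted
            if doc not in extracted and not same_class(doc):
                extracted.append(doc)
    return extracted


# Toy data, invented purely for illustration.
DEMO_DB = [
    "database index query storage",
    "database transaction index log",
    "soup recipe with fresh tomato",
    "tomato pasta recipe with basil",
    "query log storage index",
]

novel = extract_novel(DEMO_DB, ["database"], ["index", "recipe"], n1=2, n2=2)
```

With this toy data, the candidates returned for "index" resemble the two samples and are discarded, while the two recipe documents are kept. A real system would need a strategy for choosing keywords and a properly trained one-class classifier; both are left open here because the abstract does not specify them.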
Publication number JP2004348393(A) Publication date 2004.12.09
Application number JP20030144119 Filing date 2003.05.21
Applicant JAPAN SCIENCE & TECHNOLOGY AGENCY Inventors KITAGAWA HIROYUKI;MORI TAKANORI
Classification G06F17/30;G06F12/00;(IPC1-7):G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address