Title of Invention: SYSTEM, METHOD AND COMPUTER EXECUTABLE PROGRAM FOR INFORMATION TRACKING FROM HETEROGENEOUS SOURCE
Abstract: PROBLEM TO BE SOLVED: To provide a system, a method, and a program for tracking information from heterogeneous information sources. SOLUTION: An information clustering system 100 comprises: a data accumulation part 102 that accumulates documents in a document storage part, the documents being time-sliced and containing loosely correlated clusters among the documents; a vector space generation part 104 that generates document-keyword vectors consisting of sparse numerical values that depend on the presence of keywords in the documents; a dimension reduction part 106 that reduces the keyword dimensions to create a dimension reduction matrix of the document-keyword matrix; a centroid vector determination part 108 that generates a centroid vector for each cluster, the cluster being retrieved from the document-keyword vectors using a principal component in the same row of the dimension reduction matrix, and the centroid vector being defined by the keywords and weights of the documents within the cluster; and an item storage part 112 that stores the keywords and weights of the centroid vector. COPYRIGHT: (C)2008,JPO&INPIT
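The pipeline named in the abstract (vector space generation, dimension reduction, centroid determination, item storage) can be illustrated with a minimal sketch. The abstract does not specify the algorithms, so everything below is an assumption made for illustration: binary keyword presence as the vector values, truncated SVD as the dimension reduction, assignment of each document to the principal component on which it projects most strongly, and the cluster mean as the centroid. The function names and sample documents are hypothetical and do not come from the patent.

```python
import numpy as np

def doc_keyword_matrix(docs, vocab):
    """Binary document-keyword matrix: 1.0 where a keyword occurs in a document."""
    index = {kw: j for j, kw in enumerate(vocab)}
    A = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for token in doc.lower().split():
            j = index.get(token)
            if j is not None:
                A[i, j] = 1.0
    return A

def reduce_dimensions(A, k):
    """Assumed reduction step: project documents onto the top-k right singular vectors."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return A @ Vt[:k].T

def cluster_centroids(A, reduced, vocab):
    """Assumed clustering rule: each document joins the component with the
    largest absolute projection; the centroid is the mean document vector."""
    labels = np.argmax(np.abs(reduced), axis=1)
    centroids = {}
    for c in np.unique(labels):
        mean_vec = A[labels == c].mean(axis=0)
        # Keep only keywords with non-zero weight, mirroring an item store
        # that records keywords together with their weights.
        centroids[int(c)] = {kw: float(w) for kw, w in zip(vocab, mean_vec) if w > 0}
    return labels, centroids

# Hypothetical sample data: two loosely related topics.
docs = [
    "apple banana",
    "apple banana cherry",
    "apple banana",
    "router network",
    "router network switch",
]
vocab = ["apple", "banana", "cherry", "router", "network", "switch"]

A = doc_keyword_matrix(docs, vocab)
reduced = reduce_dimensions(A, k=2)
labels, centroids = cluster_centroids(A, reduced, vocab)
for cluster_id, weights in centroids.items():
    print(cluster_id, weights)
```

In this sketch the stored item for each cluster is a keyword-to-weight dictionary, one possible reading of "storing the keywords and the weights of the centroid vector". A real system would likely use weighted term frequencies and a sparse matrix representation rather than a dense binary array.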
Publication No.: JP2008181205(A)  Publication Date: 2008.08.07
Application No.: JP20070012618  Filing Date: 2007.01.23
Applicant: INTERNATL BUSINESS MACH CORP <IBM>  Inventors: KOBAYASHI MEI; REIRIN KEI YUN
Classification: G06F17/30  Main Classification: G06F17/30
Agency:  Agent:
Principal Claim:
Address: