Title of invention: Computer aided document retrieval
Abstract: A method of determining cluster attractors for a plurality of documents, each document comprising at least one term, is disclosed. For each term, the method calculates a probability distribution indicative of the frequency of occurrence of the, or each, other term that co-occurs with that term in at least one of the documents. The entropy of each probability distribution is then calculated. Finally, at least one of the probability distributions is selected as a cluster attractor depending on its entropy value. The method is used in a method of clustering a plurality of documents: once the cluster attractors are determined, each document is compared with each cluster attractor and assigned to one or more cluster attractors depending on the similarity between the document and the cluster attractors.
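The steps in the abstract can be sketched in Python as follows. This is only an illustrative reading, not the claimed implementation. The abstract does not say which entropy values are selected, how similarity is measured, or how a document is represented, so the sketch assumes that the lowest-entropy distributions become attractors, that similarity is Jensen-Shannon divergence, and that a document is a list of term tokens. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def context_distributions(docs):
    """For each term, build the distribution of the other terms that
    co-occur with it in at least one document."""
    counts = defaultdict(Counter)
    for doc in docs:
        tf = Counter(doc)
        for term in tf:
            for other, n in tf.items():
                if other != term:
                    counts[term][other] += n
    dists = {}
    for term, c in counts.items():
        total = sum(c.values())
        if total:
            dists[term] = {t: n / total for t, n in c.items()}
    return dists


def entropy(p):
    """Shannon entropy (in bits) of a distribution given as a dict."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def select_attractors(dists, k):
    # Assumption: the k lowest-entropy distributions are chosen.
    # The abstract only says selection depends on the entropy value.
    ranked = sorted(dists, key=lambda t: (entropy(dists[t]), t))
    return ranked[:k]


def js_divergence(p, q):
    """Jensen-Shannon divergence; an assumed similarity measure."""
    keys = set(p) | set(q)
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in keys}

    def kl(a):
        return sum(a[t] * math.log2(a[t] / m[t]) for t in a if a[t] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def assign(docs, dists, attractors):
    """Assign each document to its closest attractor (by term)."""
    labels = []
    for doc in docs:
        tf = Counter(doc)
        total = sum(tf.values())
        if total == 0:
            labels.append(None)  # empty document: nothing to compare
            continue
        p = {t: n / total for t, n in tf.items()}
        best = min(attractors, key=lambda a: (js_divergence(p, dists[a]), a))
        labels.append(best)
    return labels


docs = [
    ["apple", "banana"],
    ["apple", "banana", "cherry"],
    ["dog", "cat"],
    ["dog", "cat", "mouse"],
]
dists = context_distributions(docs)
attractors = select_attractors(dists, k=2)
labels = assign(docs, dists, attractors)
```

For simplicity the sketch assigns each document to a single nearest attractor, whereas the abstract allows assignment to one or more attractors; a divergence threshold could be used instead of `min` to permit multiple assignments.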
Publication number: NZ546763(A) Publication date: 2008.03.28
Application number: NZ20040546763 Filing date: 2004.09.27
Applicant: UNIVERSITY OF ULSTER;ST. PETERSBURG STATE UNIVERSITY Inventor: PATTERSON, DAVID;DOBRYNIN, VLADIMIR
Classification: G06F17/30;G06K9/62;(IPC1-7):G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: