Title: A cluster-based method and system for browsing large document collections
Abstract: <p>Scatter-Gather is a computer-based document browsing method that operates in time proportional to the number of documents in a target corpus. The Scatter-Gather method includes: preparing an initial ordering of the corpus using, for example, an off-line computational method; determining a summary of the initial ordering of the corpus for interactive utility; and providing a further ordering of the corpus using, for example, an on-line non-deterministic method. The off-line preparation of the initial ordering is not time-dependent, so an accurate initial ordering can be prepared. The step of determining a summary includes determining a summary that can be presented to a user on a CRT without scrolling. The step of providing a further ordering includes truncated group average agglomerative clustering, merging disjoint document sets, center finding, assign-to-nearest and other refinement methods.</p>
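The abstract lists assign-to-nearest among the refinement methods and mentions a short summary per group. The following is only an illustrative sketch of that general idea, not the patented procedure: it assumes documents are represented as term-frequency vectors compared by cosine similarity, and that a group summary is simply its most frequent terms. None of these choices are stated in the abstract, and the function names (`tf_vector`, `cosine`, `assign_to_nearest`, `summarize`) are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector over lowercase whitespace tokens (an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_to_nearest(docs, centers):
    """Put each document index into the group of its most similar center.

    Ties go to the earliest center. Cost is one similarity computation
    per (document, center) pair, so it is linear in the number of documents
    for a fixed number of centers.
    """
    center_vectors = [tf_vector(c) for c in centers]
    groups = [[] for _ in centers]
    for i, doc in enumerate(docs):
        v = tf_vector(doc)
        sims = [cosine(v, c) for c in center_vectors]
        best = max(range(len(sims)), key=sims.__getitem__)
        groups[best].append(i)
    return groups


def summarize(docs, group, k=3):
    """Hypothetical group summary: the k most frequent terms in the group."""
    total = Counter()
    for i in group:
        total.update(tf_vector(docs[i]))
    return [term for term, _ in total.most_common(k)]


docs = [
    "oil price rises",
    "oil export falls",
    "football match result",
    "football cup final",
]
centers = ["oil price", "football match"]
groups = assign_to_nearest(docs, centers)
for g in groups:
    print(g, summarize(docs, g, k=1))
```

In this sketch the centers are given as short texts; in practice they would come from a center-finding step such as the clustering the abstract mentions, which is not reproduced here.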
Publication number: EP0542429(B1) Publication date: 2000.05.31
Application number: EP19920309402 Filing date: 1992.10.15
Applicant: XEROX CORPORATION Inventors: PEDERSEN, JAN O.;TUKEY, JOHN W.;KARGER, DAVID;CUTTING, DOUGLASS R.
Classification: G06F17/30;(IPC1-7):G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address: