Title of invention Methods and apparatuses for information analysis on shared and distributed computing systems
Abstract Apparatuses and computer-implemented methods for analyzing, on shared and distributed computing systems, information comprising one or more documents are disclosed according to some aspects. In one embodiment, information analysis can comprise distributing one or more distinct sets of documents among each of a plurality of processes, wherein each process performs operations on a distinct set of documents substantially in parallel with other processes. Operations by each process can further comprise computing term statistics for terms contained in each distinct set of documents, thereby generating a local set of term statistics for each distinct set of documents. Still further, operations by each process can comprise contributing the local sets of term statistics to a global set of term statistics, and participating in generating a major term set from an assigned portion of a global vocabulary.
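The steps named in the abstract (distribute document sets, compute local term statistics, merge them into global statistics, then build a major term set from assigned vocabulary portions) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration only: the function names, the round-robin partitioning, the whitespace tokenizer, the use of a thread pool in place of distributed processes, and in particular the `min_df` document-frequency threshold used to pick major terms. The abstract does not specify any of these choices.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def distribute(documents, n_procs):
    """Round-robin split of documents into n_procs distinct sets (assumed scheme)."""
    return [documents[i::n_procs] for i in range(n_procs)]


def local_term_stats(doc_set):
    """Per-worker step: term and document frequencies for one document set."""
    term_freq, doc_freq = Counter(), Counter()
    for doc in doc_set:
        tokens = doc.lower().split()  # naive whitespace tokenizer (assumption)
        term_freq.update(tokens)
        doc_freq.update(set(tokens))  # count each term once per document
    return term_freq, doc_freq


def merge_stats(local_stats):
    """Contribute every local set of statistics to one global set."""
    global_tf, global_df = Counter(), Counter()
    for tf, df in local_stats:
        global_tf.update(tf)
        global_df.update(df)
    return global_tf, global_df


def major_terms_for_portion(portion, global_df, min_df):
    """Hypothetical selection rule: keep terms found in at least min_df documents."""
    return {term for term in portion if global_df[term] >= min_df}


def analyze(documents, n_procs=2, min_df=2):
    """Run the four steps; threads stand in for the distributed processes."""
    doc_sets = distribute(documents, n_procs)
    with ThreadPoolExecutor(max_workers=n_procs) as pool:
        local_stats = list(pool.map(local_term_stats, doc_sets))
        global_tf, global_df = merge_stats(local_stats)
        # Assign each worker a portion of the global vocabulary.
        vocab = sorted(global_df)
        portions = [vocab[i::n_procs] for i in range(n_procs)]
        partial_sets = pool.map(
            lambda portion: major_terms_for_portion(portion, global_df, min_df),
            portions,
        )
        major = set().union(*partial_sets)
    return global_tf, global_df, major
```

Calling `analyze(docs, n_procs=4)` on a list of strings returns the global term frequencies, the global document frequencies, and the merged major term set. A real shared or distributed system would replace the thread pool and the in-memory merge with inter-process communication, which this sketch does not attempt to model.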
Publication number US7895210(B2) Publication date 2011.02.22
Application number US20060540240 Filing date 2006.09.29
Applicant BATTELLE MEMORIAL INSTITUTE Inventors BOHN SHAWN J.;KRISHNAN MANOJ KUMAR;COWLEY WENDY E.;NIEPLOCHA JAREK
Classification number G06F17/30 Main classification number G06F17/30
Agency Agent
Principal claim
Address