Title of Invention SYSTEM AND METHOD FOR IDENTIFYING UNIQUE MESSAGES STORED IN MULTIPLE MESSAGE STORES
Abstract <p>A system (10) and method (100) for efficiently processing messages (70) stored in multiple message stores (41) are described. Metadata (35) identifying a range of topically identical messages (47) extracted from a plurality of message stores (41) storing a multiplicity of messages (70) to be processed is iteratively copied. The metadata (35) for the extracted range of topically identical messages (47) is categorized. Those messages (70) containing substantially duplicative content within the extracted range are identified as duplicate messages (47). The non-duplicate messages (44) within the extracted range are tallied into an ordering of conversation thread length (46). Those messages (70) whose content is recursively included content (72, 73) within another of the tallied non-duplicate messages (44) are classified as near-duplicate messages (45). The remaining messages (71) are designated as unique messages (44) containing substantially non-duplicative content (71).</p>
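The categorization described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation, and every name in it (`Message`, `normalize_subject`, `normalize_body`, `classify`) is hypothetical. It makes several simplifying assumptions: "topically identical" is approximated by matching subject lines after reply/forward prefixes are stripped, duplicates are detected by a SHA-256 hash of the whitespace-normalized body, conversation thread length is approximated by body length, and recursively included content is approximated by plain substring containment.

```python
import hashlib
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Hypothetical record; a real message store would carry far more metadata.
    msg_id: str
    subject: str
    body: str


def normalize_subject(subject):
    # Assumption: strip any run of leading "Re:" / "Fw:" / "Fwd:" prefixes.
    stripped = re.sub(r"^\s*((re|fwd?)\s*:\s*)+", "", subject, flags=re.IGNORECASE)
    return stripped.strip().lower()


def normalize_body(body):
    # Collapse whitespace and case so trivially different copies compare equal.
    return " ".join(body.split()).lower()


def classify(messages):
    """Label each message as 'duplicate', 'near-duplicate', or 'unique'."""
    # Step 1: group messages gathered from all stores by normalized subject.
    groups = defaultdict(list)
    for m in messages:
        groups[normalize_subject(m.subject)].append(m)

    labels = {}
    for group in groups.values():
        # Step 2: within a group, later copies of the same body are duplicates.
        seen = set()
        candidates = []
        for m in group:
            digest = hashlib.sha256(normalize_body(m.body).encode("utf-8")).hexdigest()
            if digest in seen:
                labels[m.msg_id] = "duplicate"
            else:
                seen.add(digest)
                candidates.append(m)

        # Step 3: order non-duplicates longest first (stand-in for thread length).
        candidates.sort(key=lambda m: len(normalize_body(m.body)), reverse=True)

        # Step 4: a message fully contained in a longer kept message is a
        # near-duplicate; whatever remains is unique.
        kept = []
        for m in candidates:
            text = normalize_body(m.body)
            if any(text in longer for longer in kept):
                labels[m.msg_id] = "near-duplicate"
            else:
                labels[m.msg_id] = "unique"
                kept.append(text)
    return labels
```

Processing the longest messages first means a shorter message only needs to be compared against messages already kept as unique, because substring containment is transitive. Real mail typically marks quoted text (for example with `>` prefixes), so a practical containment test would need to strip such markers first.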
Publication No. WO2002091701(A2) Publication Date 2002.11.14
Application No. US2002008471 Filing Date 2002.03.19
Applicant Inventor
Classification No. Main Classification No.
Agency Agent
Independent Claim
Address