Title Deduplicating messages for improving message sampling quality
Abstract A system and method for deduplicating messages is provided. Duplicate copies of messages are excluded from a set of deduplicated messages. The set of deduplicated messages can then be sampled to obtain a sample set usable for ensuring compliance according to a set of rules. One method for deduplicating messages involves receiving a message, determining whether the message is a duplicate copy, and adding the message to the set of deduplicated messages if it is determined that the message is not a duplicate copy.
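The receive, check, and add flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that a "duplicate copy" means a message with an identical body, detected by comparing SHA-256 digests. The names `Deduplicator`, `message_hash`, and `receive` are hypothetical and do not come from the patent.

```python
import hashlib


def message_hash(body: str) -> str:
    """Return a SHA-256 digest of the message body (assumed duplicate key)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class Deduplicator:
    """Keeps a set of deduplicated messages (hypothetical helper)."""

    def __init__(self) -> None:
        self.seen = set()        # digests of messages already accepted
        self.deduplicated = []   # messages kept, in arrival order

    def receive(self, body: str) -> bool:
        """Add the message if it is not a duplicate; return True if added."""
        digest = message_hash(body)
        if digest in self.seen:
            return False         # duplicate copy: excluded from the set
        self.seen.add(digest)
        self.deduplicated.append(body)
        return True


dedup = Deduplicator()
for text in ["Quarterly report attached", "Lunch?", "Quarterly report attached"]:
    dedup.receive(text)
# dedup.deduplicated now holds the two distinct messages
```

Hashing only the body is a design choice made for the sketch; a real compliance product might also normalize headers or attachments before comparing.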
Publication number US9164687(B1) Publication date 2015.10.20
Application number US201113006909 Filing date 2011.01.14
Applicant Symantec Corporation Inventors Bhatt Neel Atulkumar;Panse Sunil Sharad;Gupta Chirag;Barman Siddharth Ranoj;Hundekar Shankar Nabhaji
Classification G06F15/16;G06F17/00;G06F3/06 Main classification G06F15/16
Agency Campbell Stephenson LLP Agent Campbell Stephenson LLP
Principal claim 1. A method comprising: receiving a first message, wherein the first message is received by a computing device, and the computing device implements a compliance product; storing the first message at the computing device, wherein the first message is stored as part of a review set, and the review set includes duplicate copies of messages; determining if the first message is a duplicate copy of any other message in the review set; if the first message is not a duplicate copy, adding the first message to a set of deduplicated messages, wherein the determining and the adding are performed by the compliance product, and the set of deduplicated messages excludes the duplicate copies of messages from the review set; if the first message is a duplicate copy, excluding the first message from the set of deduplicated messages; sampling the set of deduplicated messages, wherein the sampling is performed by the compliance product, the sampling results in a sample set comprising a first subset of messages that are each associated with a unique hash value, and a second subset of messages that have been flagged as potentially problematic, and the sample set represents a percentage of all messages in the set of deduplicated messages within a given time frame, wherein the percentage is determined by a set of federal regulations regarding review of electronic messages, and the federal regulations are applicable to the all messages in the set of deduplicated messages; and determining whether each message in the sample set complies with the set of federal regulations.
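The sampling step of the claim can be illustrated with a short sketch. It is one possible reading, under stated assumptions: every flagged message goes into the sample, and randomly chosen unflagged messages fill the remainder until the sample reaches the required percentage of the deduplicated set. The `Message` type, the `build_sample_set` function, and the `percentage` parameter are hypothetical; the claim says only that the percentage is determined by the applicable regulations and does not give a value. Filtering by time frame is omitted for brevity.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    digest: str            # unique hash value of the message (assumed key)
    body: str
    flagged: bool = False  # marked as potentially problematic


def build_sample_set(deduplicated, percentage, rng=None):
    """Return flagged messages plus random unflagged ones, up to the target size."""
    if rng is None:
        rng = random.Random()
    flagged = [m for m in deduplicated if m.flagged]
    unflagged = [m for m in deduplicated if not m.flagged]
    # Target size: the given percentage of the deduplicated set, rounded up.
    target = math.ceil(len(deduplicated) * percentage / 100)
    # Fill whatever the flagged messages leave open, capped by availability.
    remaining = min(max(0, target - len(flagged)), len(unflagged))
    return flagged + rng.sample(unflagged, remaining)


messages = [Message(digest=f"h{i}", body=f"msg {i}", flagged=(i < 2)) for i in range(10)]
sample = build_sample_set(messages, percentage=50, rng=random.Random(0))
# 'sample' holds 5 messages: the 2 flagged ones and 3 chosen at random
```

Because sampling runs on the deduplicated set, no reviewer slot is spent on a second copy of the same message, which is the quality improvement the title refers to.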
Address Mountain View, CA, US