Title of invention SIMILAR DOCUMENT DETECTION AND ELECTRONIC DISCOVERY
Abstract Systems and methods are disclosed for performing duplicate document analyses to identify textually identical or similar documents, which may be electronic documents stored within an electronic discovery platform. A process is described which includes representing each of the documents, including a target document, as a relatively large n-tuple vector and also as a relatively small m-tuple vector, performing a series of one-dimensional searches on the set of m-tuple vectors to identify a set of documents which are near-duplicates of the target document, and then filtering that set of near-duplicate documents based upon the distance of their n-tuple vectors from that of the target document.
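The two-stage process in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how the n-tuple or m-tuple vectors are built, so the hashed word counts, the random unit-length projection, the Euclidean distance, the values of `N_DIMS` and `M_DIMS`, and every function name below are assumptions made for illustration only. Because each projection row has unit length, a document within distance `eps` of the target in n dimensions is also within `eps` along every one of the m coordinates, so the one-dimensional searches prune candidates without discarding true near-duplicates.

```python
import bisect
import math
import random
import zlib

N_DIMS = 256  # size of the large n-tuple vector (assumed value)
M_DIMS = 8    # size of the small m-tuple vector (assumed value)


def to_n_vector(text, n=N_DIMS):
    """Hashed word-count vector; a stand-in for the abstract's n-tuple."""
    vec = [0.0] * n
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % n] += 1.0
    return vec


def make_projection(n=N_DIMS, m=M_DIMS, seed=0):
    """Build m random unit-length rows (assumed way to derive the m-tuple)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(m):
        row = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in row))
        rows.append([v / norm for v in row])
    return rows


def to_m_vector(n_vec, projection):
    """Project the n-tuple onto each row to get the small m-tuple."""
    return [sum(r * x for r, x in zip(row, n_vec)) for row in projection]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_near_duplicates(texts, target_index, eps):
    """Return sorted indices of documents within eps of the target document."""
    projection = make_projection()
    n_vecs = [to_n_vector(t) for t in texts]
    m_vecs = [to_m_vector(v, projection) for v in n_vecs]
    target_n, target_m = n_vecs[target_index], m_vecs[target_index]

    candidates = set(range(len(texts))) - {target_index}
    for dim in range(M_DIMS):
        # One-dimensional range search over this coordinate, sorted by value.
        column = sorted((m_vecs[i][dim], i) for i in range(len(texts)))
        keys = [value for value, _ in column]
        lo = bisect.bisect_left(keys, target_m[dim] - eps)
        hi = bisect.bisect_right(keys, target_m[dim] + eps)
        candidates &= {i for _, i in column[lo:hi]}

    # Final filter: keep candidates whose full n-tuple is close to the target.
    return sorted(i for i in candidates if distance(n_vecs[i], target_n) <= eps)
```

Calling `find_near_duplicates(texts, 0, 1.5)` returns the indices of the documents whose n-tuple vectors lie within 1.5 of document 0. A production system would presumably build the sorted per-coordinate indexes once and reuse them across queries instead of re-sorting for each target as this sketch does.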
Publication number US2013212090(A1) Publication date 2013.08.15
Application number US201313763253 Filing date 2013.02.08
Applicant STROZ FRIEDBERG, LLC Inventors SPERLING MICHAEL;JIN RONG;RAYVYCH ILLYA;LI JIANGHONG;YI JINFENG
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address