Title of Invention: AUTOMATED IDENTIFICATION OF RECURRING TEXT
Abstract: In embodiments, one or more computer-readable media may have instructions stored thereon which, when executed by a processor of a computing device, provide the computing device with a recurring text identification service. The recurring text identification service may be configured, in some embodiments, to receive a request to identify recurring text within a plurality of documents. The recurring text identification service may be further configured to analyze individual segments of the plurality of documents to generate segment identifiers respectively associated with the segments. In embodiments, the segment identifiers may be based on content of the segments. In embodiments, segments with the same content may have equivalent segment identifiers. The recurring text identification service may further be configured to generate a distribution of the segment identifiers and may enable the distribution of segment identifiers to be used to streamline identification of recurring text within the plurality of documents.
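The steps named in the abstract (content-based segment identifiers, then a distribution of those identifiers) can be illustrated with a minimal Python sketch. This is not the patented implementation: the record does not say how segments are delimited or how identifiers are computed. The sketch assumes that a segment is a non-empty line, that the identifier is a SHA-256 digest of the whitespace-normalized text, and that "recurring" means an identifier seen at least `min_count` times. All function names are hypothetical.

```python
import hashlib
from collections import Counter


def split_segments(document):
    # Assumption: one segment per non-empty line; the record does not define segmentation.
    return [line for line in document.splitlines() if line.strip()]


def segment_id(segment):
    # Assumption: identifier = SHA-256 of the whitespace-normalized text,
    # so segments with the same content get the same identifier.
    normalized = " ".join(segment.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def identifier_distribution(documents):
    # Count how often each identifier occurs across all documents,
    # keeping the first segment seen for each identifier for display.
    counts = Counter()
    examples = {}
    for document in documents:
        for segment in split_segments(document):
            sid = segment_id(segment)
            counts[sid] += 1
            examples.setdefault(sid, segment)
    return counts, examples


def recurring_segments(documents, min_count=2):
    # Report segments whose identifier occurs at least min_count times,
    # most frequent first.
    counts, examples = identifier_distribution(documents)
    return [(examples[sid], n) for sid, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    docs = [
        "Hello Ann\nCONFIDENTIAL: do not forward",
        "Goodbye Bob\nCONFIDENTIAL: do not forward",
    ]
    for text, n in recurring_segments(docs):
        print(n, text)
```

Sorting the distribution by frequency puts the most repeated segments first, which is one plausible way such a distribution could streamline review of recurring text such as email footers or disclaimers.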
Publication No.: US2015066976(A1) Publication Date: 2015.03.05
Application No.: US201314072595 Filing Date: 2013.11.05
Applicant: Lighthouse Document Technologies, Inc. (d/b/a Lighthouse eDiscovery) Inventors: Dahl Christopher; Belger Geoffrey Alan David
Classification: G06F17/30 Main Classification: G06F17/30
Agency: Agent:
Main Claim: 1. One or more computer-readable media having instructions stored thereon which, when executed by a processor of a computing device, cause the computing device to provide a recurring text identification service configured to: receive a request to identify recurring text within a plurality of documents; analyze individual segments of the plurality of documents to generate segment identifiers respectively associated with the segments, wherein the segment identifiers are based at least in part on content of the segments, and wherein segments with the same content have equivalent segment identifiers; generate a distribution of the segment identifiers; and enable the distribution of segment identifiers to be used to streamline identification of recurring text within the plurality of documents.
Address: Seattle WA US