Title of invention: SYSTEM FOR IDENTIFYING COMMON DIGITAL SEQUENCES
Abstract: A system and method for unorchestrated determination of data sequences that uses "sticky byte" factoring to determine breakpoints in digital sequences so that common sequences can be identified. Sticky byte factoring provides an efficient method of dividing a data set into pieces that generally yields near-optimal commonality. This is accomplished by employing a rolling hash sum and, in an exemplary embodiment disclosed herein, a threshold function to deterministically set divisions in a sequence of data. Both the rolling hash and the threshold function are designed to require minimal computation. This low overhead makes it possible to rapidly partition a data sequence for presentation to a factoring engine or to other applications that prefer subsequent synchronization across the data set.
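The idea of setting breakpoints with a rolling hash and a threshold test can be illustrated with a minimal sketch. This is not the hash sum or threshold function of the disclosed embodiment, which this record does not specify. It substitutes a generic polynomial rolling hash and a simple bit-mask test, and the names `find_breakpoints` and `split_chunks` and the constants `WINDOW`, `MASK`, `BASE`, and `MOD` are all assumptions chosen for illustration.

```python
from typing import List

# All constants below are illustrative assumptions, not values from the patent.
WINDOW = 16            # number of trailing bytes the rolling hash covers
MASK = (1 << 6) - 1    # threshold test: break when the low 6 bits are zero
BASE = 257             # polynomial base for the rolling hash
MOD = (1 << 31) - 1    # modulus keeping the hash in a fixed range


def find_breakpoints(data: bytes, window: int = WINDOW, mask: int = MASK) -> List[int]:
    """Return offsets (exclusive chunk ends) chosen from local content only."""
    breakpoints: List[int] = []
    top = pow(BASE, window - 1, MOD)  # weight of the byte leaving the window
    h = 0
    for i, byte in enumerate(data):
        if i >= window:
            # Remove the contribution of the byte that slides out of the window.
            h = (h - data[i - window] * top) % MOD
        # Shift the window and add the incoming byte.
        h = (h * BASE + byte) % MOD
        # Only test once a full window has been seen, so the decision
        # depends on exactly the last `window` bytes.
        if i + 1 >= window and (h & mask) == 0:
            breakpoints.append(i + 1)
    return breakpoints


def split_chunks(data: bytes) -> List[bytes]:
    """Cut `data` at the breakpoints; the chunks concatenate back to `data`."""
    chunks: List[bytes] = []
    start = 0
    for end in find_breakpoints(data):
        chunks.append(data[start:end])
        start = end
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because each decision in this sketch depends only on the last `WINDOW` bytes, inserting bytes at the front of a sequence shifts the later breakpoints without changing where they fall relative to the content. That is the property that lets two independently partitioned copies of similar data resynchronize on common pieces.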
Publication number: WO0237689(A1)  Publication date: 2002.05.10
Application number: WO2001US31306  Filing date: 2001.10.04
Applicant: AVAMAR TECHNOLOGIES, INC.; MOULTON, GREGORY, HAGAN  Inventor: MOULTON, GREGORY, HAGAN
Classification: G06F12/00; G06F7/00; G06F7/06; G06F7/22; G06F17/30; H03M7/30; H04L23/00  Main classification: G06F12/00
Agency:  Agent:
Principal claim:
Address: