Title of invention Configurable data generator
Abstract Embodiments associated with configurable, repeatable data generation are described. One example method includes manipulating a redundancy parameter that controls data redundancy in binary large objects (BLOBs) to be included in a generated data set. The redundancy parameter may control variations in repeatable variable-length sequences included in BLOBs. The example method also includes manipulating one or more parameters that control custom-designed sequences included in BLOBs. With the redundancy and custom-designed sequences described, the example method then generates BLOBs based, at least in part, on the redundancy parameter and the custom-designed sequences. BLOBs may include byte sequences repeated at different frequencies and configurable user-designed sequences. Manipulating the redundancy parameter, manipulating the custom-designed sequences, generating the BLOBs, and providing the BLOBs may be performed by separate processes acting in parallel.
Publication number US8983916(B2) Publication date 2015.03.17
Application number US201213524450 Filing date 2012.06.15
Applicant Inventors Stoakes Timothy;Jones Craig Edward
Classification G06F17/30;G06F3/06 Main classification G06F17/30
Patent agency Agent
Main claim 1. A non-transitory computer-readable medium storing computer-executable instructions that when executed by a computer cause the computer to perform a method, the method comprising: manipulating one or more redundancy parameters that control redundancy in data to be generated, where manipulating the one or more redundancy parameters includes manipulating a degree of internal redundancy for a subset of the data, a degree of external redundancy between subsets of the data, a frequency with which internal redundancy is to vary, or a frequency with which external redundancy is to vary; manipulating one or more parameters that control custom-designed sequences to be included in the data, where manipulating the one or more parameters that control custom-designed sequences includes manipulating a sequence length distribution, where manipulating the sequence length distribution follows a kurtosis rule, where the kurtosis rule defines the sequence length distribution to follow a geometric frequency distribution; generating the data based, at least in part, on the one or more redundancy parameters, where the data includes one or more variable custom-designed sequences, and where the data comprises one or more binary large objects exhibiting byte-sequence variability with binary large object dispersion, where the data include redundant spans that are specified as random seed generated variable length patterns from within a constrained number-space, and where the redundant spans are controlled, at least in part, by the redundancy parameters; and providing the data from the computer to a data de-duplicator, where manipulating the one or more redundancy parameters, manipulating the one or more parameters that control custom-designed sequences, generating the data, and providing the data are performed at least partially in parallel on the computer.
Address
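
Illustrative sketch (not part of the patent record). The claim describes seed-driven, repeatable generation of BLOBs whose redundant spans are variable-length patterns drawn from a constrained number-space, with span lengths following a geometric distribution. The Python below is one minimal way such a generator might look, not the patented implementation. All names (`geometric_length`, `build_pattern_pool`, `generate_blob`) and all parameters (`p`, `max_len`, `pool_size`, `redundancy`) are hypothetical. A single `redundancy` probability stands in for the claim's several redundancy parameters, and parallel execution and delivery to a de-duplicator are not modeled.

```python
import random


def geometric_length(rng, p, max_len):
    """Draw a length >= 1 from a geometric distribution, capped at max_len.

    Length k occurs with probability (1 - p)**(k - 1) * p before capping.
    """
    length = 1
    while length < max_len and rng.random() >= p:
        length += 1
    return length


def build_pattern_pool(seed, pool_size, p=0.05, max_len=256):
    """Build a small, fixed pool of byte patterns (the constrained number-space).

    Pattern i depends only on (seed, i), so the pool is repeatable.
    """
    pool = []
    for i in range(pool_size):
        rng = random.Random(f"pattern:{seed}:{i}")
        n = geometric_length(rng, p, max_len)
        pool.append(bytes(rng.randrange(256) for _ in range(n)))
    return pool


def generate_blob(seed, blob_index, size, redundancy, pool, p=0.05, max_len=256):
    """Generate one BLOB of exactly `size` bytes.

    With probability `redundancy`, the next span is a pattern reused from the
    pool (a redundant span); otherwise it is fresh pseudo-random bytes.
    """
    rng = random.Random(f"blob:{seed}:{blob_index}")
    out = bytearray()
    while len(out) < size:
        if rng.random() < redundancy:
            out += rng.choice(pool)
        else:
            n = geometric_length(rng, p, max_len)
            out += bytes(rng.randrange(256) for _ in range(n))
    return bytes(out[:size])


if __name__ == "__main__":
    pool = build_pattern_pool(seed=42, pool_size=16)
    blobs = [generate_blob(42, i, 4096, 0.5, pool) for i in range(4)]
    # Re-running with the same seed reproduces the same data set.
    again = [generate_blob(42, i, 4096, 0.5, pool) for i in range(4)]
    print(blobs == again)
```

Because every random choice flows from the seed and the BLOB index, the same parameters always reproduce the same data set, which is what makes it usable as a repeatable test input. Raising `redundancy` increases how often pool patterns recur within and across BLOBs.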