Title of invention |
Managing storage of individually accessible data units |
Abstract |
Managing data by: receiving a group of individually accessible data units, each data unit identified by a key value, with key values determined such that the key value identifying a first data unit received before a second data unit occurs earlier in a sort order than the key value identifying the second data unit; and processing the data units for storage in a data storage system. The processing includes: storing blocks of data, the blocks being generated by combining a plurality of the data units; providing an index with entries that enable location, based on a provided key value, of a block that includes a data unit corresponding to the provided key value; and generating one or more screening data structures associated with the blocks for determining, based on a given key value, whether to search the stored blocks for a data unit corresponding to the given key value. |
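The block-and-index part of the abstract can be illustrated with a small Python sketch. It assumes data units arrive as (key, value) pairs already in ascending key order, packs them into fixed-size blocks, and keeps one index entry per block (here, the block's first key). The names `build_blocks` and `lookup` and the `block_size` parameter are hypothetical, chosen for illustration; the record does not prescribe this layout.

```python
from bisect import bisect_right


def build_blocks(units, block_size=3):
    """Pack (key, value) pairs, already sorted by key, into blocks.

    Returns (blocks, index), where index[i] is the first key of blocks[i].
    The fixed block size is an assumption made for this sketch.
    """
    blocks = [units[i:i + block_size] for i in range(0, len(units), block_size)]
    index = [block[0][0] for block in blocks]
    return blocks, index


def lookup(blocks, index, key):
    """Use the index to pick the single block that could hold `key`, then scan it."""
    pos = bisect_right(index, key) - 1  # last block whose first key <= key
    if pos < 0:
        return None  # key sorts before every stored key
    for k, value in blocks[pos]:
        if k == key:
            return value
    return None


# Keys increase in arrival order, as the abstract requires.
units = [(k, f"record-{k}") for k in (2, 5, 7, 11, 13, 17, 19)]
blocks, index = build_blocks(units)
```

Because keys are assigned in arrival order, the blocks are sorted without an extra sort step, and a binary search over the per-block index entries is enough to find the one candidate block.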
Publication number |
US8949189(B2) |
Publication date |
2015.02.03 |
Application number |
US201313942277 |
Filing date |
2013.07.15 |
Applicant |
Ab Initio Technology LLC |
Inventors |
Kulkarni Vrishal;Schmidt Stephen;Stanfill Craig W.;Vishniac Ephraim Meriwether |
Classification number |
G06F17/30 |
Main classification number |
G06F17/30 |
Agency |
Fish & Richardson P.C. |
Agent |
Fish & Richardson P.C. |
Principal claim |
1. A method for managing data, the method including:
  receiving at least one group of individually accessible data units over an input device or port, each data unit identified by a key value, with key values of received data units being determined such that the key value identifying a given first data unit that is received before a given second data unit occurs earlier in a sort order than the key value identifying the given second data unit; and
  processing, by at least one processor, the received data units for storage in a data storage system, the processing including
    storing a plurality of blocks of data, one or more of the blocks being generated by combining a plurality of the received data units;
    providing an index that includes an entry for each of the blocks, wherein one or more of the entries enable location, based on a provided key value, of a block that includes a data unit corresponding to the provided key value; and
    generating one or more screening data structures associated with the stored blocks for determining, based on a given key value and one or more of the screening data structures, whether to search the stored blocks for a data unit that corresponds to the given key value;
  wherein generating the one or more screening data structures is based on a user-defined probability that a screening data structure correctly or incorrectly identifies a stored block as the location of a data unit. |
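One way to picture a screening data structure generated from a user-defined probability is a Bloom filter sized from a target false-positive rate. This is only a sketch under that assumption: the claim as quoted here does not name a Bloom filter, and the class `BloomFilter`, its parameters, and the SHA-256 double-hashing scheme are all illustrative choices, not the patented implementation.

```python
import hashlib
import math


class BloomFilter:
    """Hypothetical screening structure sized from a user-chosen error rate."""

    def __init__(self, expected_items, false_positive_rate):
        # Standard Bloom filter sizing:
        #   bits   m = -n * ln(p) / (ln 2)^2
        #   hashes k = (m / n) * ln 2
        ln2 = math.log(2)
        self.size = max(1, math.ceil(
            -expected_items * math.log(false_positive_rate) / ln2 ** 2))
        self.num_hashes = max(1, round(self.size / expected_items * ln2))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, key):
        # Derive k bit positions from two 64-bit halves of one digest
        # (double hashing); the choice of SHA-256 is an assumption.
        digest = hashlib.sha256(str(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means the key is definitely absent, so the block search
        # can be skipped; True means the stored blocks must be searched.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


# Screen 1000 stored keys with a 1% target false-positive rate.
stored_keys = list(range(0, 2000, 2))
screen = BloomFilter(expected_items=len(stored_keys), false_positive_rate=0.01)
for key in stored_keys:
    screen.add(key)
```

A lookup would consult `screen.might_contain(key)` first and search the stored blocks only when it returns `True`. Lowering the chosen rate makes the filter larger but avoids more needless block searches, which is the trade-off a user-defined probability exposes.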
Address |
Lexington MA US |