Title of invention |
Systems and methods for providing increased scalability in deduplication storage systems |
Abstract |
A computer-implemented method for providing increased scalability in deduplication storage systems may include (1) identifying a database that stores a plurality of reference objects, (2) determining that at least one size-related characteristic of the database has reached a predetermined threshold, (3) partitioning the database into a plurality of sub-databases capable of being updated independent of one another, (4) identifying a request to perform an update operation that updates one or more reference objects stored within at least one sub-database, and then (5) performing the update operation on less than all of the sub-databases to avoid processing costs associated with performing the update operation on all of the sub-databases. Various other systems, methods, and computer-readable media are also disclosed. |
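The partition-then-update-a-subset idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how reference objects are distributed across sub-databases, so the CRC32-based bucketing below is purely an assumption, as are the names `bucket`, `partition`, `maybe_partition`, and `update`, and the use of entry count as the size-related characteristic.

```python
import zlib


def bucket(fingerprint: str, parts: int) -> int:
    # Assumed placement rule (not from the patent): stable hash of the segment fingerprint.
    return zlib.crc32(fingerprint.encode()) % parts


def partition(database: dict, parts: int) -> list:
    # Split one reference database into `parts` independently updatable sub-databases.
    subs = [dict() for _ in range(parts)]
    for fingerprint, files in database.items():
        subs[bucket(fingerprint, parts)][fingerprint] = files
    return subs


def maybe_partition(database: dict, threshold: int, parts: int) -> list:
    # Assumed size characteristic: number of reference objects in the database.
    if len(database) >= threshold:
        return partition(database, parts)
    return [database]


def update(subs: list, fingerprint: str, file_id: str) -> int:
    # Touch only the one sub-database that owns the fingerprint; return its index.
    idx = bucket(fingerprint, len(subs))
    subs[idx].setdefault(fingerprint, set()).add(file_id)
    return idx
```

In this sketch an update reads and writes a single sub-database, which is the property the abstract relies on to avoid the cost of updating all of them.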
Publication number |
US8954401(B2) |
Publication date |
2015.02.10 |
Application number |
US201113007301 |
Filing date |
2011.01.14 |
Applicant |
Symantec Corporation |
Inventors |
Zhang Xianbo;Guo Fanglu;Wu Weibao |
Classification codes |
G06F7/00;G06F17/00;G06F17/30;G06F13/14;G06F11/14 |
Main classification code |
G06F7/00 |
Agency |
ALG Intellectual Property, LLC |
Agent |
ALG Intellectual Property, LLC |
Principal claim |
1. A computer-implemented method for providing increased scalability in deduplication storage systems, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising:
identifying a database that stores a plurality of reference objects, wherein each reference object within the database identifies both:
at least one unique file segment stored in a deduplication storage system; and
for each unique file segment, whether one or more backed-up files within the deduplication storage system currently reference the unique file segment;
determining that the size of the entire database as a whole has reached a predetermined threshold;
in response to determining that the size of the entire database as a whole has reached the predetermined threshold:
partitioning the database into a plurality of sub-databases capable of being updated independent of one another, the plurality of sub-databases comprising an inactive sub-database that is empty after the time of the partition;
designating the inactive sub-database within the plurality of sub-databases as an active sub-database for storing reference objects created after the time of the designation;
identifying a request to perform an update operation that updates one or more reference objects stored within the active sub-database;
performing the update operation only on the active sub-database to avoid processing costs associated with performing the update operation on all of the sub-databases within the plurality of sub-databases, wherein performing the update operation comprises:
postponing performing the update operation until identifying a predetermined number of other requests to perform other update operations on the active sub-database;
identifying other requests to perform other update operations on the active sub-database;
determining that the number of the other requests identified has reached the predetermined number of other requests;
sequentially performing the update operation and the other update operations on the active sub-database. |
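The claimed flow (an empty sub-database made active after a size threshold is reached, with updates postponed and then applied sequentially to the active sub-database only) might be sketched as follows. This is an illustrative simplification, not the patented implementation: the class `PartitionedReferenceDB`, its attributes, and the parameters `size_threshold` and `batch_size` are hypothetical. `batch_size` counts the first request plus the "other" requests, and the same threshold is reapplied to the active sub-database on later rollovers, which the claim does not state.

```python
class PartitionedReferenceDB:
    """Hypothetical reference database with one active and zero or more frozen sub-databases."""

    def __init__(self, size_threshold: int, batch_size: int):
        self.size_threshold = size_threshold  # assumed metric: number of reference objects
        self.batch_size = batch_size          # assumed: first request plus the other requests
        self.active = {}                      # fingerprint -> set of referencing file ids
        self.frozen = []                      # older sub-databases, never touched by updates
        self.pending = []                     # postponed update requests

    def request_update(self, fingerprint: str, file_id: str) -> None:
        # Postpone the update until enough requests have accumulated.
        self.pending.append((fingerprint, file_id))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Apply the postponed updates one after another, on the active sub-database only.
        for fingerprint, file_id in self.pending:
            self.active.setdefault(fingerprint, set()).add(file_id)
        self.pending.clear()
        # Once the threshold is reached, freeze the current contents and
        # designate a new, empty sub-database as the active one.
        if len(self.active) >= self.size_threshold:
            self.frozen.append(self.active)
            self.active = {}
```

With `size_threshold=3` and `batch_size=2`, the first request stays queued, the second triggers a batch of two, and the batch that pushes the active sub-database to three or more entries causes a rollover to a fresh, empty active sub-database.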
Address |
Mountain View CA US |