主权项 |
1. A method comprising:
implementing a hierarchical file system that supports modifications using chunks stored in a content addressable storage system, wherein at least a portion of the chunks represents file content; wherein making a modification in the hierarchical file system comprises storing one or more new chunks in the content addressable storage system that represent the hierarchical file system structure after the modification; performing a garbage collection operation that reclaims stale chunks, within the content addressable storage system, that no longer reflect a current state of the hierarchical file system as of a first particular time due to one or more modifications in the hierarchical file system; wherein performing the garbage collection operation includes:
during a trace phase of the garbage collection operation, performing one or more recursive touch-traversal operations to establish new last-touched timestamps for chunks visited during the recursive touch-traversal operations;wherein each of the one or more recursive touch-traversal operations is initiated at a distinct chunk that, as of a second particular time that is later than the first particular time, was a current root of a corresponding tree within the hierarchical file system;during the trace phase, determining whether to initiate a particular recursive touch-traversal operation at a particular chunk, which was the current root of a particular tree within the hierarchical file system as of the second particular time, based, at least in part, on a sub-tree-timestamp maintained for the particular tree;responsive to the sub-tree-timestamp of the particular tree being before the first particular time, initiating the particular recursive touch-traversal operation at the particular chunk;responsive to the sub-tree-timestamp of the particular tree being after the first particular time, performing the trace phase without initiating any recursive touch-traversal operation at the particular chunk;during a sweep phase, reclaiming chunks, within the content addressable storage system, that have last-touched timestamps that are before the first particular time; wherein the method is performed by one or more computing devices. |