Title REDUCING INPUT/OUTPUT (I/O) OPERATIONS FOR CENTRALIZED BACKUP AND STORAGE
Abstract Techniques are described for reducing I/O operations and storage capacity requirements for centralized backup storage systems. A central server optimizes the collection and centralization of backup data from a number of endpoint devices. The central server uses a single instance store and a persistent files cache to minimize the number of backup copies of each non-unique file and to reduce the storage usage, network traffic, memory footprint and CPU cycles required to identify and process non-unique data. For each file in the single instance store, the server tracks the source devices of that file until a threshold number of source devices has been reached. Once the file reaches the threshold number of sources, the file is marked as persistent and its hash value is placed in the persistent files cache. Thereafter, all pointer creation and reference counting for that file cease.
Publication No. US2016188229(A1) Publication Date 2016.06.30
Application No. US201414583477 Filing Date 2014.12.26
Applicant VMware, Inc. Inventors Rabinovich Dmitry; Genah Meytal; Gartsbein Anton
Classification G06F3/06 Main Classification G06F3/06
Agency Agent
Main Claim 1. A method for reducing input/output (I/O) operations for centralized data storage, the method comprising: receiving, from each of a plurality of endpoint devices to a central server, a manifest that identifies a listing of files located on said each endpoint device; for each received manifest, inspecting the manifest received to the central server to determine which files need to be uploaded from the endpoint device to the central server in order to construct a full image of the endpoint device on the central server, the inspecting performed by: for each file identified in the manifest received from an endpoint device, determining whether the file is available on the central server and requesting the file if the file is not available on the central server; if the file is available on the central server, determining whether the file has been marked as persistent; if the file has not been marked as persistent, storing an indication that the endpoint device is a source of the file; and determining whether the file has at least a threshold number of source endpoint devices and marking the file as persistent if the file has at least the threshold number of source endpoint devices.
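The inspection steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (CentralServer, StoredFile, inspect_manifest, receive_upload) is hypothetical, the manifest is assumed to be a list of content hashes, the threshold value of 3 is arbitrary, and recording the uploading device as the first source is an assumption the claim does not state.

```python
from dataclasses import dataclass, field

PERSISTENCE_THRESHOLD = 3  # assumed value; the claim only says "a threshold number"


@dataclass
class StoredFile:
    """One entry in the single instance store (hypothetical layout)."""
    sources: set = field(default_factory=set)  # endpoint devices known to hold the file
    persistent: bool = False


class CentralServer:
    def __init__(self, threshold=PERSISTENCE_THRESHOLD):
        self.threshold = threshold
        self.store = {}                # file hash -> StoredFile
        self.persistent_cache = set()  # hashes of files marked persistent

    def _add_source(self, file_hash, device_id):
        # Record the device as a source, then mark the file persistent
        # once it has at least the threshold number of sources.
        entry = self.store[file_hash]
        entry.sources.add(device_id)
        if len(entry.sources) >= self.threshold:
            entry.persistent = True
            self.persistent_cache.add(file_hash)

    def inspect_manifest(self, device_id, manifest):
        """Return the hashes the endpoint must upload for a full image."""
        to_upload = []
        for file_hash in manifest:
            if file_hash in self.persistent_cache:
                continue  # persistent: no further source tracking
            if file_hash not in self.store:
                to_upload.append(file_hash)  # not available: request the file
                continue
            self._add_source(file_hash, device_id)
        return to_upload

    def receive_upload(self, device_id, file_hash):
        # Assumption: the uploading device counts as the first source.
        self.store.setdefault(file_hash, StoredFile())
        self._add_source(file_hash, device_id)


server = CentralServer(threshold=3)
missing = server.inspect_manifest("device-a", ["h1"])  # file unknown, must be uploaded
server.receive_upload("device-a", "h1")
server.inspect_manifest("device-b", ["h1"])            # second source recorded
server.inspect_manifest("device-c", ["h1"])            # third source: marked persistent
server.inspect_manifest("device-d", ["h1"])            # cache hit, nothing recorded
```

In this sketch, a hit in the persistent files cache skips the store lookup and the source update entirely, which is the I/O saving the abstract describes for widely shared files.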
Address Palo Alto, CA, US