Title of invention: Distributing data for a distributed filesystem across multiple cloud storage systems
Abstract: The disclosed embodiments provide a system that distributes data for a distributed filesystem across multiple cloud storage systems. Two or more cloud controllers collectively manage distributed filesystem data that is stored in one or more cloud storage systems; the cloud controllers cache the stored data and ensure its consistency. Whenever a cloud controller receives new data from a client, it outputs an incremental metadata snapshot for the new data that is propagated to the other cloud controllers and an incremental data snapshot containing the new data that is sent to a cloud storage system. During operation, data stored in the distributed filesystem can be distributed across two or more cloud storage systems to optimize performance and/or cost for the distributed filesystem.
Publication number: US8799413(B2)  Publication date: 2014.08.05
Application number: US201213725738  Filing date: 2012.12.21
Applicant: Panzura, Inc.  Inventors: Taylor John Richard; Chou Randy Yen-pang; Davis Andrew P.
Classification: G06F15/16  Main classification: G06F15/16
Agency: Park, Vaughan, Fleming & Dowler LLP  Agent: Park, Vaughan, Fleming & Dowler LLP; Spiller Mark D.
Main claim: 1. A computer-implemented method for distributing data for a distributed filesystem across multiple cloud storage systems, the method comprising:
collectively managing the data of the distributed filesystem using two or more cloud controllers by:
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients can only access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in a first remote cloud storage system using cloud files, wherein each cloud controller caches a subset of the file data from the first remote cloud storage system that is being actively accessed by that cloud controller's respective clients, wherein all new file data received by each cloud controller from its clients is written to the first remote cloud storage system via the receiving cloud controller;
maintaining at each cloud controller a copy of the complete metadata for all of the files stored in the distributed filesystem, wherein each cloud controller communicates any changes to the metadata for the distributed filesystem to the full set of cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of each file in the distributed filesystem;
upon receiving in a cloud controller new file data from a client, storing the new file data for the distributed filesystem in the first remote cloud storage system, wherein the new file data includes two or more new files that are stored in a cloud file, wherein the cloud file is sent from the cloud controller to the first remote cloud storage system as part of an incremental data snapshot; and
upon receiving confirmation that the cloud file has been successfully stored in the first remote cloud storage system, sending from the cloud controller an incremental metadata snapshot that includes new metadata for the distributed filesystem that describes the new file data and links to the cloud file, wherein the incremental metadata snapshot is received by the other cloud controllers of the distributed filesystem; and
distributing data stored in the distributed filesystem across the first remote cloud storage system and a second remote cloud storage system to optimize at least one of performance and cost of the distributed filesystem.
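The write path recited in the claim (pack new files into a cloud file, upload it as an incremental data snapshot, then propagate an incremental metadata snapshot only after the upload is confirmed) can be illustrated with a minimal in-memory sketch. This is not the patented implementation: the class names `CloudStorage` and `CloudController`, the methods `put`, `write_files`, `apply_metadata_snapshot`, and `read`, and the metadata layout (store name, cloud file key, offset, length) are all hypothetical and chosen only for illustration. The policy that decides which storage system receives a given cloud file is not modelled; the sketch only records the store name in the metadata so that data could live in more than one store.

```python
from dataclasses import dataclass, field


@dataclass
class CloudStorage:
    """Hypothetical stand-in for a remote cloud storage system."""
    name: str
    objects: dict = field(default_factory=dict)

    def put(self, key: str, blob: bytes) -> bool:
        self.objects[key] = blob
        return True  # acts as the storage confirmation


@dataclass
class CloudController:
    """Hypothetical controller holding a full metadata copy and a data cache."""
    name: str
    storage: CloudStorage
    peers: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
    cache: dict = field(default_factory=dict)
    _seq: int = 0

    def write_files(self, files: dict) -> str:
        # Pack all new files into a single cloud file.
        self._seq += 1
        key = f"{self.name}-cloudfile-{self._seq}"
        blob = b""
        delta = {}
        for path, data in files.items():
            delta[path] = {
                "store": self.storage.name,
                "cloud_file": key,
                "offset": len(blob),
                "length": len(data),
            }
            blob += data
            self.cache[path] = data
        # Incremental data snapshot: upload the cloud file first.
        if not self.storage.put(key, blob):
            raise IOError("cloud file upload was not confirmed")
        # Incremental metadata snapshot: sent only after confirmation.
        self.apply_metadata_snapshot(delta)
        for peer in self.peers:
            peer.apply_metadata_snapshot(delta)
        return key

    def apply_metadata_snapshot(self, delta: dict) -> None:
        self.metadata.update(delta)

    def read(self, path: str, stores: dict) -> bytes:
        # Serve from cache when possible, else fetch from the recorded store.
        if path in self.cache:
            return self.cache[path]
        entry = self.metadata[path]
        blob = stores[entry["store"]].objects[entry["cloud_file"]]
        data = blob[entry["offset"]:entry["offset"] + entry["length"]]
        self.cache[path] = data
        return data
```

In this sketch, a peer controller that never received the file data can still serve a read, because the metadata snapshot tells it which cloud file and byte range to fetch. Ordering the metadata snapshot after the storage confirmation means peers never hold metadata that points at a cloud file that is not yet stored.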
Address: San Jose, CA, US