Title: Method and apparatus for single instance indexing of backups
Abstract: A method and apparatus for single instance indexing of backup images is provided. In one example, a content identifier is established for a file in the backup images. An index database associated with the backup images is queried with the content identifier. The content and metadata of the file are indexed if the content identifier is not in the index database. Only the metadata for the file is indexed if the content identifier is already in the index database. In one example, the content identifier comprises a file identifier defined by the metadata for the file. In another example, the content identifier comprises a checksum computed for the file.
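The indexing rule in the abstract can be illustrated with a minimal sketch. It uses the checksum variant of the content identifier; the choice of SHA-256, the `IndexDatabase` class, and its method names are assumptions made for illustration and do not come from the patent.

```python
import hashlib


def content_id(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the unspecified "checksum".
    return hashlib.sha256(data).hexdigest()


class IndexDatabase:
    """Hypothetical in-memory index: content stored once, metadata per file."""

    def __init__(self):
        self.content = {}    # content identifier -> indexed content
        self.metadata = []   # one (content identifier, metadata) pair per file

    def index_file(self, data: bytes, metadata: dict) -> bool:
        """Index one file; return True if its content was newly indexed."""
        cid = content_id(data)
        is_new = cid not in self.content
        if is_new:
            # Identifier not in the index: index content and metadata.
            self.content[cid] = data
        # Metadata is indexed in both cases.
        self.metadata.append((cid, metadata))
        return is_new


db = IndexDatabase()
first = db.index_file(b"quarterly report", {"path": "/home/a/report.doc"})
second = db.index_file(b"quarterly report", {"path": "/home/b/copy.doc"})
```

Here the second call adds only a metadata entry, because the identifier of its content is already present in the index.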
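Publication number: US9342524(B1); Publication date: 2016.05.17
Application number: US200711704755; Filing date: 2007.02.09
Applicant: Veritas Technologies LLC; Inventor: Doty Keith
Classification: G06F17/30; Main classification: G06F17/30
Agency: Campbell Stephenson LLP; Agent: Campbell Stephenson LLP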
Main claim: 1. A method of indexing backup images in a computer network, comprising:
  creating the backup images in memory;
  establishing a first file identifier for a first file in a first of the backup images, wherein
    the first file identifier is established by combining attributes in file metadata and attributes in catalog data, and
    the file metadata and the catalog data are maintained separately;
  computing a checksum for the first file;
  determining a backup type of the first backup image is set to full backup;
  in response to determining the backup type of the first backup image is set to full backup, querying a search index associated with the backup images with the first file identifier and with the checksum;
  determining whether the first file is duplicative of a file that has previously been indexed by the search index by calculating if there is a match of the first file identifier or the checksum in the search index based on the querying, wherein
    in response to not finding a match of the first file identifier or the checksum in the search index,
      determining that the first file is not duplicative of the file,
      including the checksum for the first file in the file metadata, and
      updating the search index by adding the file metadata for the first file and content of the first file, and
    in response to finding a match of the first file identifier or the checksum in the search index,
      comparing file sizes of the first file and the file,
      determining that the first file is duplicative of the file if the file sizes of the first file and the file are equal and if the match is of the checksum, and
      updating the search index by adding the file metadata for the first file, but not content of the first file; and
  maintaining the first file in memory after updating the search index.
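The decision flow of claim 1 can be sketched as follows. This is an illustrative reading, not the patented implementation. The names `SearchIndex`, `make_file_id`, and `index_file`, the attribute keys (`path`, `mtime`, `client`), and the use of SHA-256 are all assumptions. The claim does not say what happens for a non-full backup or for a match that fails the size and checksum test; the sketch skips the former and indexes content for the latter, and both choices are assumptions.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SearchIndex:
    """Hypothetical search index keyed by file identifier and by checksum."""
    file_ids: set = field(default_factory=set)
    sizes_by_checksum: dict = field(default_factory=dict)  # checksum -> size
    content: dict = field(default_factory=dict)            # checksum -> bytes
    metadata: list = field(default_factory=list)


def make_file_id(file_meta: dict, catalog: dict) -> str:
    # Combines attributes held separately in file metadata and catalog data.
    # Which attributes are combined is an assumption.
    return f"{catalog['client']}:{file_meta['path']}:{file_meta['mtime']}"


def index_file(index, backup_type, file_meta, catalog, data: bytes) -> str:
    if backup_type != "full":
        return "skipped"  # assumption: the claim only covers full backups

    file_id = make_file_id(file_meta, catalog)
    checksum = hashlib.sha256(data).hexdigest()  # assumption: SHA-256
    size = len(data)

    id_match = file_id in index.file_ids
    sum_match = checksum in index.sizes_by_checksum
    # Duplicative only if the checksum matched and the file sizes are equal.
    duplicate = sum_match and index.sizes_by_checksum[checksum] == size

    entry = dict(file_meta, file_id=file_id, size=size)
    if duplicate:
        result = "metadata-only"
    else:
        # No match at all, or (assumption) a match that failed the test.
        entry["checksum"] = checksum
        index.content[checksum] = data
        index.sizes_by_checksum[checksum] = size
        result = "content+metadata"

    index.file_ids.add(file_id)
    index.metadata.append(entry)
    return result  # `data` is left untouched, so the file stays in memory


idx = SearchIndex()
cat = {"client": "host1"}
r1 = index_file(idx, "full", {"path": "/a.txt", "mtime": 1}, cat, b"hello")
r2 = index_file(idx, "full", {"path": "/b.txt", "mtime": 2}, cat, b"hello")
r3 = index_file(idx, "incremental", {"path": "/c.txt", "mtime": 3}, cat, b"x")
```

In this sketch an identifier match on its own (`id_match`) never makes a file duplicative, which mirrors the claim's requirement that the match be of the checksum.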
Address: Mountain View, CA, US