Title: Optimized content search of files
Abstract: A method, system, computer system, and computer-readable medium to search contents of a large number of files. Data are read sequentially from a storage device without using a file system. Physical location information for the files is obtained and used to construct files from the data read. Such physical location information can be obtained, for example, by accessing a file system mapping catalog without causing the file system to read the files. Accessing the mapping catalog can be performed quickly because only metadata is read from the mapping catalog. The constructed files can then be searched for content without the overhead of the file system. Content such as virus signatures and keywords can therefore be discovered much more quickly. Furthermore, because the device is read sequentially, storage locations belonging to more than one file are read only once, further improving the performance of the content search.
Publication number: US8935281(B1)  Publication date: 2015.01.13
Application number: US200511262567  Filing date: 2005.10.31
Applicant: Symantec Operating Corporation  Inventors: Kale Sanjay Ramchandra; Nagarkar Kuldeep Sureshrao; Marode Abhay Harishchandra
Classification: G06F7/00; G06F17/30  Main classification: G06F7/00
Agency: Campbell Stephenson LLP  Agent: Campbell Stephenson LLP
Independent claim: 1. A method comprising:
sequentially reading a set of contiguous storage locations of a storage device;
obtaining physical location information for portions of a plurality of files stored in the set of contiguous storage locations on the storage device;
storing at least one of the portions corresponding to a first file, in response at least in part to detecting that the first file is incomplete, wherein the detecting is based at least in part upon the physical location information, and the storing is performed subsequent to the sequentially reading the set of contiguous storage locations;
searching a second file of the plurality of files for a pattern, in response at least in part to detecting that the second file is complete, wherein the detecting is based at least in part upon the physical location information, and the searching the second file is performed subsequent to the sequentially reading the set of contiguous storage locations;
sequentially reading a second set of contiguous storage locations, wherein the sequentially reading the second set of contiguous storage locations is performed subsequent to the storing the at least one of the portions and subsequent to the searching the second file, and the second set of contiguous storage locations store second portions of one or more of the plurality of files;
constructing the first file from the stored at least one of the portions and at least one of the second portions, wherein the constructing is based at least in part upon the physical location information; and
searching the first file for the pattern, wherein the searching the first file is performed in response at least in part to the constructing the first file.
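The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, only one possible reading of the claim under stated assumptions: an in-memory bytes object stands in for the raw storage device, a dictionary of (device offset, length) extents stands in for the physical location information taken from the mapping catalog, and the names scan_device, extent_map, and chunk_size are hypothetical. Each chunk is read once, in order. Files whose extents have all been seen are searched immediately, and portions of incomplete files are stored until a later chunk completes them.

```python
from typing import Dict, List, Tuple

# Assumption: an extent is (device_offset, length); a file's extents are
# listed in logical file order. This stands in for the mapping catalog.
Extent = Tuple[int, int]


def scan_device(device: bytes, extent_map: Dict[str, List[Extent]],
                pattern: bytes, chunk_size: int = 8) -> List[str]:
    """Return names of files containing pattern, reading the device once."""
    pieces = []  # (device_offset, length, file_name, logical_offset_in_file)
    sizes = {}
    for name, extents in extent_map.items():
        logical = 0
        for dev_off, length in extents:
            pieces.append((dev_off, length, name, logical))
            logical += length
        sizes[name] = logical

    partial = {}                               # buffers of incomplete files
    received = {name: 0 for name in extent_map}
    hits = []

    for start in range(0, len(device), chunk_size):
        chunk = device[start:start + chunk_size]   # one sequential read
        end = start + len(chunk)
        for dev_off, length, name, logical in pieces:
            lo, hi = max(start, dev_off), min(end, dev_off + length)
            if lo >= hi:
                continue                       # extent not in this chunk
            # Store the portion at its logical position within the file.
            buf = partial.setdefault(name, bytearray(sizes[name]))
            dst = logical + (lo - dev_off)
            buf[dst:dst + (hi - lo)] = chunk[lo - start:hi - start]
            received[name] += hi - lo
        # Search every file that this chunk completed, then free its buffer.
        for name in [n for n in partial if received[n] == sizes[n]]:
            if pattern in partial.pop(name):
                hits.append(name)
    return hits


# File "b" is fragmented, so the pattern only appears once it is constructed.
device = b"hello!" + b"VIR" + b"nothing" + b"US" + b"pad"
extent_map = {"a": [(0, 6)], "b": [(6, 3), (16, 2)], "c": [(9, 7)]}
found = scan_device(device, extent_map, b"VIRUS")
```

In this example file "a" is complete after the first chunk and is searched right away, while the portions of file "b" are held until the third chunk supplies its final extent. A storage location listed in the extents of two files is still read only once, because both files copy from the same chunk.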
Address: Mountain View, CA, US