Title of invention: TECHNIQUES FOR COMPACT DATA STORAGE OF NETWORK TRAFFIC AND EFFICIENT SEARCH THEREOF
Abstract: In networked communication systems, a document in a communication (e.g., a response) may be similar between multiple communications involving the same resource, such that duplicate data can be discarded and not stored by a network storage system. Storage of differences in network traffic facilitates compression of storage of network traffic, thereby significantly reducing data storage. Techniques are disclosed for efficient search and retrieval of the compressed data storage. Network traffic may be compared to communications in previous network traffic to identify differences, if any. Resource templates may be generated for different (e.g., new) resources identified in network traffic. Storage of the different resources identified in network traffic enables compression of network traffic. Similarity matching may be implemented to improve processing performance for compact storage of network traffic, including determining differences in network traffic for storage.
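The idea of storing only the differences against a previously seen resource can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `make_edit_log` and `apply_edit_log`, the line-level granularity, and the use of the standard-library `difflib.SequenceMatcher` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def make_edit_log(template, items):
    """Return only the operations needed to turn `template` into `items`.

    Both arguments are lists of strings (for example, the lines of a page).
    Each log entry is (start, end, replacement): replace template[start:end]
    with `replacement`. Unchanged spans are not stored at all.
    """
    log = []
    matcher = SequenceMatcher(a=template, b=items, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            log.append((i1, i2, items[j1:j2]))
    return log


def apply_edit_log(template, log):
    """Rebuild the full item list from the stored template plus an edit log."""
    out = []
    pos = 0
    for start, end, replacement in log:
        out.extend(template[pos:start])  # copy the unchanged span
        out.extend(replacement)          # apply the stored difference
        pos = end
    out.extend(template[pos:])
    return out


# Hypothetical example data: a stored template and a later, similar response.
template = ["<html>", "<h1>Cart</h1>", "<p>Items: 2</p>", "</html>"]
response = ["<html>", "<h1>Cart</h1>", "<p>Items: 3</p>", "</html>"]

edit_log = make_edit_log(template, response)
restored = apply_edit_log(template, edit_log)
```

In this sketch only the single changed line is kept in `edit_log`; the response is recovered on demand by replaying the log against the template.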
Publication number: US2016226976(A1) Publication date: 2016.08.04
Application number: US201615009488 Filing date: 2016.01.28
Applicant: Quantum Metric, LLC Inventors: Ciabarra, JR. Mario Luciano; Wang Yiduo
Classification: H04L29/08; H04L29/06 Main classification: H04L29/08
Agency: Agent:
Independent claim: 1. A computer-implemented method for compact storage of network communication, the method comprising: receiving, by a computer system, one or more data packets comprising a communication transmitted by a server computer, the communication including a resource requested by a client computer system; parsing, by the computer system, based on one or more delimiters, the requested resource to identify a plurality of data items in the requested resource; generating, by the computer system, a first set of hash values for the plurality of data items, wherein the first set of hash values is generated based on applying one or more hashing algorithms to the plurality of data items; retrieving, by the computer system, one or more stored templates, each of the one or more stored templates including different content; determining, by the computer system, a second set of hash values for each of the one or more stored templates; for each stored template of the one or more stored templates, performing, by the computer system, a comparison of the first set of hash values to the second set of hash values corresponding to each stored template; computing, by the computer system, a similarity value based on the comparison; upon determining that the similarity value indicates that the first set of hash values is not similar to the second set of hash values for a first stored template, generating, by the computer system, an edit log using the plurality of data items and the first stored template, wherein the edit log identifies differences between the plurality of data items of the requested resource and the first stored template; and storing, by the computer system, the edit log in a data store.
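The parsing, hashing, and comparison steps recited in the claim can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed method itself: the claim does not name a hashing algorithm or a similarity measure, so SHA-256 and the Jaccard index are stand-ins, and the helper names (`parse_items`, `hash_items`, `similarity`, `best_template`) and the sample templates are hypothetical.

```python
import hashlib


def parse_items(resource, delimiters=("\n",)):
    """Split a resource into data items on each delimiter in turn."""
    pieces = [resource]
    for delimiter in delimiters:
        pieces = [part for piece in pieces for part in piece.split(delimiter)]
    return [piece for piece in pieces if piece.strip()]


def hash_items(items):
    """Return the set of hash values for the items (SHA-256 is an assumption)."""
    return {hashlib.sha256(item.encode("utf-8")).hexdigest() for item in items}


def similarity(first, second):
    """Jaccard index of two hash sets: shared hashes divided by all hashes."""
    if not first and not second:
        return 1.0
    return len(first & second) / len(first | second)


def best_template(resource, templates):
    """Score the resource against every stored template; return (score, name)."""
    first = hash_items(parse_items(resource))
    scored = [
        (similarity(first, hash_items(parse_items(body))), name)
        for name, body in templates.items()
    ]
    return max(scored) if scored else (0.0, None)


# Hypothetical stored templates and an incoming resource.
templates = {
    "cart": "<html>\n<h1>Cart</h1>\n<p>Items: 2</p>\n</html>",
    "home": "<html>\n<h1>Home</h1>\n<p>Welcome</p>\n<p>News</p>\n</html>",
}
resource = "<html>\n<h1>Cart</h1>\n<p>Items: 3</p>\n</html>"

score, name = best_template(resource, templates)
```

Comparing sets of hashes rather than full text keeps the per-template comparison cheap. How the resulting similarity value is thresholded, and which template the edit log is then generated against, is governed by the claim language and is left to the caller in this sketch.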
Address: Monument, CO, US