Title: INTEGRITY CHECKING AND SELECTIVE DEDUPLICATION BASED ON NETWORK PARAMETERS
Abstract: An approach for managing a data package is provided. Network throughput is determined to exceed a threshold. A sender computer determines a hash digest of the data package by using a hash function selected based on central processing unit utilization. If the hash digest is in a sender hash table, then without sending the data package, the sender computer sends the hash digest and an index referring to the hash digest so that a recipient computer can use the index to locate a matching hash digest and the data package in a recipient hash table. If the hash digest is not in the sender hash table, then the sender computer adds the data package and the hash digest to the sender hash table and sends the data package and the hash digest to the second computer to check the integrity of the data package based on the hash digest.
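The recipient side described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Recipient` class, the dictionary message layout (`type`, `data`, `digest`, `index`, `hash`), and the choice to append entries in arrival order so that indexes line up with the sender are all assumptions made for the example. Other message kinds are out of scope.

```python
import hashlib


class Recipient:
    """Hypothetical recipient; the table layout is an assumption."""

    def __init__(self):
        self.digests = []   # recipient hash table: index -> digest
        self.packages = []  # index -> data package

    def receive(self, msg):
        if msg["type"] == "data":
            # Full package received: recompute the digest to check integrity.
            actual = hashlib.new(msg["hash"], msg["data"]).hexdigest()
            if actual != msg["digest"]:
                raise ValueError("integrity check failed")
            # Store in arrival order so indexes match the sender's table.
            self.digests.append(msg["digest"])
            self.packages.append(msg["data"])
            return msg["data"]
        if msg["type"] == "ref":
            # Only digest and index received: look the package up locally.
            i = msg["index"]
            if not 0 <= i < len(self.digests) or self.digests[i] != msg["digest"]:
                raise LookupError("no matching digest at that index")
            return self.packages[i]
        raise ValueError("unsupported message type")
```

A package that arrives in full is verified before it is stored, so a later reference message can only resolve to data that already passed the integrity check.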
Publication number: US2015154244(A1)  Publication date: 2015.06.04
Application number: US201514589264  Filing date: 2015.01.05
Applicant: International Business Machines Corporation  Inventors: Haustein Nils; Seipp Harald; Troppens Ulf; Winarski Daniel J.
Classification: G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Main claim: 1. A method of managing a data package, the method comprising the steps of:
  a first computer determining a measurement of throughput through a network is greater than a throughput threshold, the network including the first computer and a second computer;
  the first computer determining whether a utilization of a central processing unit (CPU) included in the first computer is greater than a CPU utilization threshold; and
  based on the measurement of throughput being greater than the throughput threshold, the first computer entering a deduplication mode and subsequently performing steps in the deduplication mode that include:
    the first computer sending a notification to the second computer that the first computer has entered the deduplication mode;
    the first computer selecting a first hash function instead of a second hash function if the utilization of the CPU is greater than the CPU utilization threshold, and selecting the second hash function instead of the first hash function if the utilization of the CPU is less than or equal to the CPU utilization threshold;
    the first computer determining a hash digest of the data package by utilizing the selected first or second hash function;
    the first computer determining whether the hash digest is in a sender hash table coupled to the first computer;
    if the hash digest is in the sender hash table, then without the first computer sending the data package to the second computer, the first computer sending to the second computer the hash digest, an index referring to the hash digest in the sender hash table and in a recipient hash table coupled to the second computer, and optionally an identifier of the selected first or second hash function; and
    if the hash digest is not in the sender hash table, then the first computer adding the data package and the hash digest to the sender hash table and sending the data package, the hash digest, and the identifier of the selected first or second hash function to the second computer to determine whether the data package has integrity based on the hash digest.
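The sender-side steps of the claim can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the `Sender` class, the threshold values and units, the dictionary message layout, and the pairing of SHA-1 as the "first" hash function with SHA-256 as the "second" are all hypothetical. The claim only says that the choice depends on CPU utilization. What the sender does when throughput is at or below the threshold is not covered by the claim; the sketch simply sends the package unchanged.

```python
import hashlib

# Hypothetical thresholds; the claim does not state values or units.
THROUGHPUT_THRESHOLD = 100.0  # assumed unit: MB/s
CPU_THRESHOLD = 0.75          # assumed: fraction of CPU in use

# Assumption: the "first" hash function is used when the CPU is busy.
FIRST_HASH = "sha1"
SECOND_HASH = "sha256"


class Sender:
    def __init__(self):
        self.index_of = {}   # sender hash table: digest -> index
        self.packages = []   # index -> data package
        self.dedup_mode = False

    def send(self, package, throughput, cpu_utilization):
        """Return the list of messages to transmit for one data package."""
        if throughput <= THROUGHPUT_THRESHOLD:
            # Outside the claim; this sketch sends the bytes as-is.
            self.dedup_mode = False
            return [{"type": "raw", "data": package}]

        messages = []
        if not self.dedup_mode:
            # Notify the second computer on entering deduplication mode.
            self.dedup_mode = True
            messages.append({"type": "dedup_mode_on"})

        # Select the hash function based on CPU utilization.
        name = FIRST_HASH if cpu_utilization > CPU_THRESHOLD else SECOND_HASH
        digest = hashlib.new(name, package).hexdigest()

        if digest in self.index_of:
            # Known digest: send digest and index, not the package itself.
            messages.append({"type": "ref", "digest": digest,
                             "index": self.index_of[digest], "hash": name})
        else:
            # New digest: record it, then send package, digest, and hash id.
            self.index_of[digest] = len(self.packages)
            self.packages.append(package)
            messages.append({"type": "data", "data": package,
                             "digest": digest, "hash": name})
        return messages
```

In this sketch a package hashed once with each function occupies two table entries, because the table is keyed by digest alone, as in the claim wording.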
Address: Armonk, NY, US