Title of Invention: Efficient Distributed Query Execution
Abstract: An embodiment of the invention provides a method wherein a database query including a first constraint and one or more additional constraints is received in a first node. Data in the first node that satisfies the first constraint is identified, encoded, and sent to a second node. Encoded data that is present in a mapping table in the second node is identified, and one or more missing identifiers are identified that include encoded data that is not in the mapping table. The missing identifier is sent to the first node, decoded to retrieve the value of the missing identifier, and mapped to the retrieved value. The mapping of the missing identifier to the retrieved value is sent to the second node. A dictionary in the second node is queried with the retrieved value to identify an identification number for the retrieved value. The missing identifier is mapped to the identification number for the retrieved value.
Application Publication No.: US2017083632(A1) Publication Date: 2017.03.23
Application No.: US201514858657 Filing Date: 2015.09.18
Applicant: International Business Machines Corporation Inventors: Kotoulas Spyros; Sbodio Marco Luca; Stephenson Martin Joseph; Tommasi Pierpaolo
Classification: G06F17/30 Main Classification: G06F17/30
Agency: Agent:
Main Claim: 1. A method for query execution in multiple nodes of a distributed database system, said method comprising: receiving a database query in a first node of the distributed database system, the database query including a first constraint and at least one additional constraint; identifying data in the first node that satisfies the first constraint with a first processor; encoding the data with an encoder to generate encoded data; sending the encoded data to a second node of the distributed database system with a first communications device; identifying at least one encoded data of the encoded data that is in a mapping table in the second node with a second processor; identifying at least one missing identifier with the second processor, the at least one missing identifier including at least one encoded data of the encoded data that is not in the mapping table in the second node; sending the missing identifier to the first node with a second communications device; decoding the missing identifier to retrieve the value of the missing identifier; mapping the missing identifier to the retrieved value; sending the mapping of the missing identifier and the retrieved value to the second node with the first communications device; querying a dictionary in the second node with the retrieved value to identify an identification number for the retrieved value; and mapping the missing identifier to the identification number for the retrieved value.
Address: Armonk NY US
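
The exchange recited in the main claim can be illustrated with a minimal in-process sketch. This is not the patented implementation. The class names (FirstNode, SecondNode), the method names, the use of plain Python dicts as the encoder, mapping table, and dictionary, and the sample data are all assumptions made for illustration. Network transfer is replaced by direct method calls, and the additional query constraints are left out.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FirstNode:
    """Hypothetical first node: holds rows and a local value -> identifier encoder."""
    encode_table: dict[str, int]
    rows: list[str]

    def matching_encoded(self, first_constraint: Callable[[str], bool]) -> list[int]:
        # Identify data that satisfies the first constraint, then encode it.
        return [self.encode_table[v] for v in self.rows if first_constraint(v)]

    def decode(self, missing: list[int]) -> dict[int, str]:
        # Decode each missing identifier and map it to its retrieved value.
        reverse = {ident: value for value, ident in self.encode_table.items()}
        return {ident: reverse[ident] for ident in missing}


@dataclass
class SecondNode:
    """Hypothetical second node: own dictionary plus a table of first-node identifiers."""
    dictionary: dict[str, int]  # value -> identification number on this node
    mapping_table: dict[int, int] = field(default_factory=dict)

    def find_missing(self, encoded: list[int]) -> list[int]:
        # Identifiers not yet in the mapping table, de-duplicated in arrival order.
        return list(dict.fromkeys(e for e in encoded if e not in self.mapping_table))

    def learn(self, decoded: dict[int, str]) -> None:
        # Query the dictionary with each retrieved value and record the mapping.
        for missing_id, value in decoded.items():
            # Assumption: values unknown to this node's dictionary are skipped;
            # the claim does not say how that case is handled.
            if value in self.dictionary:
                self.mapping_table[missing_id] = self.dictionary[value]


# Usage with made-up data: the second node already knows identifier 1.
first = FirstNode(encode_table={"alice": 1, "bob": 2, "carol": 3},
                  rows=["alice", "bob", "carol"])
second = SecondNode(dictionary={"alice": 101, "bob": 102, "carol": 103},
                    mapping_table={1: 101})

encoded = first.matching_encoded(lambda v: v != "carol")  # sent to second node
missing = second.find_missing(encoded)                    # sent back to first node
decoded = first.decode(missing)                           # sent to second node
second.learn(decoded)
```

In this sketch only the identifiers absent from the second node's mapping table make the extra round trip. Identifiers that are already mapped are resolved locally without being decoded again.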