Title of invention: ADAPTIVE DISTRIBUTION METHOD FOR HASH OPERATIONS
Abstract: A method, apparatus, and system for join operations of a plurality of relations that are distributed over a plurality of storage locations over a network of computing components.
Publication number: US2015234896(A1)  Publication date: 2015.08.20
Application number: US201514626836  Filing date: 2015.02.19
Applicant: Snowflake Computing Inc.  Inventors: Dageville Benoit; Cruanes Thierry; Zukowski Marcin; Lee Allison Waingold; Unterbrunner Philipp Thomas
Classification: G06F17/30; H04L29/08  Main classification: G06F17/30
Agency:  Agent:
Principal claim: 1. A method for joining relations distributed over a computer network and associated via communication links and processing nodes, the method comprising: receiving a relational join query for a join operation comprising a predicate and a plurality of relations wherein the desired join uses an equivalence operation; placing communication links between a build operation and a probe operation that are inactive and are in an adaptive state; placing communication links between a first relation and the probe operation that are inactive and are in an adaptive state; placing communication links between a second relation in a partition state such that any tuples of the second relation are forwarded to the build operation; repeating the build operation until the second relation is fully consumed and forwarded to the build operation such that an actual size of the second relation is known after being fully consumed; and determining whether to join the relations via a broadcasting join or a re-partitioning join based primarily on the actual size of the second relation, an estimated size of the first relation, and a cost metric.
Address: San Mateo CA US
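
The final step of the principal claim, choosing between a broadcasting join and a re-partitioning join once the second (build-side) relation has been fully consumed, can be illustrated with a minimal sketch. The record does not disclose the cost metric, so the formulas below are an assumption: a common textbook model that counts rows moved over the network. The names `consume_build_side`, `choose_join_strategy`, and `JoinPlan` are hypothetical and do not come from the patent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class JoinPlan:
    strategy: str            # "broadcast" or "repartition"
    broadcast_cost: float    # rows moved if the build side is broadcast
    repartition_cost: float  # rows moved if both sides are re-partitioned


def consume_build_side(rows, key):
    """Fully consume the build-side relation into a hash table.

    Returns the table and the actual row count, which is only known
    once the input is exhausted.
    """
    table = defaultdict(list)
    count = 0
    for row in rows:
        table[row[key]].append(row)
        count += 1
    return table, count


def choose_join_strategy(actual_build_rows, estimated_probe_rows, num_nodes):
    """Pick a distribution strategy from an assumed network-cost model.

    Assumption (not from the patent): cost is the number of rows that
    must cross the network, with rows spread evenly over the nodes.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    # Broadcast: every build row is copied to each of the other nodes.
    broadcast_cost = actual_build_rows * (num_nodes - 1)
    # Re-partition: each row of either side moves with probability (n-1)/n.
    repartition_cost = (
        (actual_build_rows + estimated_probe_rows) * (num_nodes - 1) / num_nodes
    )
    strategy = "broadcast" if broadcast_cost <= repartition_cost else "repartition"
    return JoinPlan(strategy, broadcast_cost, repartition_cost)
```

Under this assumed model, a small build side joined against a large probe side favors broadcasting, while two relations of comparable size favor re-partitioning. The decision uses the measured build-side size but only an estimate for the probe side, mirroring the inputs named in the claim.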