Title of invention: System, method, and computer-readable medium for dynamic detection and management of data skew in parallel join operations
Abstract: A system, method, and computer-readable medium for dynamic detection and management of data skew in parallel join operations are provided. Rows allocated to processing modules involved in a join operation are redistributed among the processing modules by a hash redistribution of the join attributes. Receipt by a processing module of an excessive number of redistributed rows having a skewed value on the join attribute is detected by that processing module, which notifies the other processing modules of the skewed value. Processing modules then terminate redistribution of rows having a join attribute value matching the skewed value and either store such rows locally or duplicate the rows. The processing module that has received an excessive number of redistributed rows removes rows having a skewed value of the join attribute from a redistribution spool allocated thereto and duplicates the rows to each of the processing modules. The join operation is completed by performing a local join at each processing module and merging the results of the local join operations.
Publication number: US8510280(B2)  Publication date: 2013.08.13
Application number: US20090494366  Filing date: 2009.06.30
Applicant: XU YU; KOSTAMAA OLLI PEKKA; ZHOU XIN; TERADATA US, INC.  Inventor: XU YU; KOSTAMAA OLLI PEKKA; ZHOU XIN
Classification: G06F17/00  Main classification: G06F17/00
Agency:  Agent:
Main claim:
Address:
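
The flow described in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: the function name `skew_aware_join`, the `threshold` parameter, the spool names, and the sequential event loop standing in for concurrent redistribution are all hypothetical. It also assumes that rows of the skewed table (here R) are the ones kept locally after the skew notice, while matching rows of the other table (here S) are the ones duplicated to every module; the abstract only says rows are "either stored locally or duplicated".

```python
from collections import Counter, defaultdict


def skew_aware_join(r_parts, s_parts, threshold):
    """Join R and S on row[0]. r_parts[m] / s_parts[m] are module m's local rows.

    Hypothetical sketch: R is assumed to be the table that may be skewed.
    """
    n = len(r_parts)
    r_redis = [[] for _ in range(n)]          # R rows received by hash redistribution
    r_local = [[] for _ in range(n)]          # R rows kept locally after a skew notice
    s_redis = [[] for _ in range(n)]          # S rows received by hash redistribution
    s_dup = [[] for _ in range(n)]            # S rows duplicated to every module
    received = [Counter() for _ in range(n)]  # per-module count of received R rows per value
    skewed = set()                            # shared set stands in for inter-module notices

    # Sequential stand-in for concurrent redistribution: each module sends S, then R.
    events = []
    for m in range(n):
        events += [("S", m, row) for row in s_parts[m]]
        events += [("R", m, row) for row in r_parts[m]]

    for table, src, row in events:
        key = row[0]
        if key in skewed:
            # Redistribution of this value has been terminated.
            if table == "R":
                r_local[src].append(row)      # store locally
            else:
                for spool in s_dup:           # duplicate to all modules
                    spool.append(row)
            continue
        dst = hash(key) % n                   # hash redistribution on the join attribute
        if table == "S":
            s_redis[dst].append(row)
            continue
        r_redis[dst].append(row)
        received[dst][key] += 1
        if received[dst][key] > threshold:    # excessive rows for one value: skew detected
            skewed.add(key)
            # The overloaded module pulls matching S rows out of its
            # redistribution spool and duplicates them to every module.
            moved = [s for s in s_redis[dst] if s[0] == key]
            s_redis[dst] = [s for s in s_redis[dst] if s[0] != key]
            for spool in s_dup:
                spool.extend(moved)

    # Local join at each module, then merge the per-module results.
    out = []
    for m in range(n):
        by_key = defaultdict(list)
        for s in s_redis[m] + s_dup[m]:
            by_key[s[0]].append(s)
        for r in r_redis[m] + r_local[m]:
            out += [(r, s) for s in by_key.get(r[0], [])]
    return out, skewed
```

In this sketch every R row ends up at exactly one module, and every S row with a skewed value ends up exactly once at each module, so the merged local joins produce the same pairs as a plain join while the overloaded module stops accumulating rows for the skewed value once it is detected.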