Title of invention Techniques for dynamic partitioning in a distributed parallel computational environment
Abstract An apparatus includes an organization component to retrieve from task instructions an indication of a type of organization of data set subportions prior to performance of a computation and a data item by which the data set subportions are to be organized, organize each data set subportion among others based on the data item and type of organization, monitor availability of a first processing resource and a first storage resource of a node device employed to organize the data set subportions, and based on insufficient availability of at least one of the first processing resource or the first storage resource, interrupt the organization of the data set subportions, and dispatch a first set of one or more organized data set subportions to be processed; and a performance component to execute the task instructions to process each organized data set subportion.
Publication number US9298807(B1) Publication date 2016.03.29
Application number US201514845662 Filing date 2015.09.04
Applicant SAS INSTITUTE INC. Inventor Zanter David Laverne
Classification G06F17/30;H04L12/26;H04L12/721 Main classification G06F17/30
Agency Kacvinsky Daisak Bluni PLLC Agent Kacvinsky Daisak Bluni PLLC
Main claim 1. An apparatus comprising: a processor component; a network interface to couple the processor component to a network to receive task instructions to perform a computation with data set subportions within a data set portion as an input to the computation; an organization component for execution by the processor component to retrieve from the task instructions an indication of a type of organization required of the data set subportions prior to performance of the computation and a data item within each data set subportion by which the data set subportions are to be organized, wherein: the type of organization comprises at least one of ordering or grouping the data set subportions of the data set portion by the data item; for each data set subportion of the data set portion, the organization component is to: organize the data set subportion among others of the data set subportions within the data set portion based on the data item and the indicated type of organization; and monitor an availability of a first processing resource and a first storage resource of a node device employed to organize the data set subportion; and based on insufficient availability of at least one of the first processing resource or the first storage resource, the organization component is to interrupt the organization of the data set subportions, and dispatch a first set of one or more organized data set subportions to be processed; a performance component for execution by the processor component to, for each organized data set subportion of the first set, execute the task instructions to process the organized data set subportion; and a completion component for execution by the processor component to operate the network interface to transmit one or more processed data set subportions processed from the first set to another device via the network as part of aggregating the processed data set subportions with other processed data set subportions associated with another data set portion, and to trigger a return to organization of another data set subportion of the data set portion not yet organized by the organization component to generate a second set of one or more organized data set subportions.
Address Cary NC US
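Illustrative sketch The organize, interrupt, dispatch, and resume cycle recited in the main claim can be pictured with a small single-process Python example. This is only a sketch under stated assumptions, not the patented implementation: the names `process_portion`, `aggregate`, and `max_buffered_rows` are hypothetical, a simple row-count budget stands in for monitoring the processing and storage resources of a node device, and an in-memory list stands in for transmission over the network.

```python
from collections import defaultdict


def process_portion(rows, key, compute, max_buffered_rows):
    """Group rows by `key`, dispatching early whenever the buffer budget is hit.

    Hypothetical stand-ins: `max_buffered_rows` models "insufficient
    availability" of a storage resource, and the returned list models the
    processed subportions transmitted to another device.
    """
    transmitted = []             # stand-in for network transmission
    groups = defaultdict(list)   # organized (grouped) subportions so far
    buffered = 0

    def dispatch():
        nonlocal buffered
        # Performance step: run the computation on each organized group.
        # Keys are assumed to be orderable so dispatch order is deterministic.
        for value in sorted(groups):
            transmitted.append((value, compute(groups[value])))
        groups.clear()           # free the buffer, then return to organizing
        buffered = 0

    for row in rows:
        groups[row[key]].append(row)       # organization step
        buffered += 1
        if buffered >= max_buffered_rows:  # resource check fails: interrupt
            dispatch()
    if groups:                             # dispatch whatever remains
        dispatch()
    return transmitted


def aggregate(partials):
    """Combine partial results, since one key may be dispatched several times.

    Assumes the computation is additive (for example, a sum).
    """
    totals = defaultdict(int)
    for value, partial in partials:
        totals[value] += partial
    return dict(totals)


def total_amount(group):
    return sum(r["amount"] for r in group)


rows = [
    {"region": "east", "amount": 1},
    {"region": "west", "amount": 2},
    {"region": "east", "amount": 3},
    {"region": "west", "amount": 4},
    {"region": "east", "amount": 5},
]
partials = process_portion(rows, "region", total_amount, max_buffered_rows=3)
print(partials)             # [('east', 4), ('west', 2), ('east', 5), ('west', 4)]
print(aggregate(partials))  # {'east': 9, 'west': 6}
```

Because an early dispatch can split one group across several sets, the sketch relies on a downstream aggregation step to merge partial results, which loosely mirrors the aggregation of processed subportions described in the claim. A computation that is not additive would need a different merge rule.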