Title of invention: FULL EXPLOITATION OF PARALLEL PROCESSORS FOR DATA PROCESSING
Abstract: Exemplary method, system, and computer program product embodiments for full exploitation of parallel processors for data processing are provided. In one embodiment, by way of example only, a set of parallel processors is partitioned into disjoint subsets according to indices of the set of the parallel processors. The size of each of the disjoint subsets corresponds to the number of processors assigned to the processing of the data chunks at one of the layers. Each of the processors is assigned to different layers in different data chunks such that every processor is busy and the data chunks are fully processed within a number of time steps equal to the number of the layers. A transition function is devised from the indices of the set of the parallel processors at one time step to the indices of the set of the parallel processors at the following time step.
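The partitioning step described in the abstract can be illustrated with a minimal sketch. It assumes n = 2^d − 1 processors (as in the main claim) and, as an assumption not stated in this record, that the subset for each layer is half the size of the subset for the preceding layer. The function name `partition_by_layer` and the contiguous index ranges are hypothetical choices made only for illustration.

```python
def partition_by_layer(d):
    """Split processor indices 0..n-1 (n = 2**d - 1) into d disjoint subsets.

    Assumption (not from the patent record): layer k receives
    2**(d - 1 - k) processors, so each layer has half as many
    processors as the preceding one.
    """
    n = 2 ** d - 1
    subsets = []
    start = 0
    for layer in range(d):
        size = 2 ** (d - 1 - layer)              # assumed halving per layer
        subsets.append(list(range(start, start + size)))
        start += size
    # The subset sizes 2**(d-1) + ... + 2 + 1 add up to exactly n.
    assert start == n
    return subsets


if __name__ == "__main__":
    for layer, procs in enumerate(partition_by_layer(3)):
        print(f"layer {layer}: processors {procs}")
```

With d = 3 this yields subsets of sizes 4, 2, and 1, which together cover all 7 processor indices without overlap.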
Application publication No.: US2015234685(A1)  Publication date: 2015.08.20
Application No.: US201514623919  Filing date: 2015.02.17
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION  Inventors: HIRSCH Michael; KLEIN Shmuel T.; TOAFF Yair
Classification: G06F9/50; G06F17/30; G06F9/48  Main classification: G06F9/50
Agency:  Agent:
Main claim: 1. A method for full exploitation, after a plurality of initialization steps, of a set of a plurality of parallel processors to perform a task on a sequence of data chunks by a processor device in a computing environment, wherein each of the data chunks is processed in several time steps and by a plurality of layers, with the plurality of layers being dealt with by at least one of a plurality of processors at each of the time steps, the method comprising: partitioning the set of the plurality of parallel processors into disjoint subsets according to indices of the set of the plurality of parallel processors such that: the plurality of parallel processors are partitioned in accordance with one or more of a plurality of constraints; the number of the plurality of parallel processors that are available is n = 2^d − 1 (from claim 6), where d is the number of the plurality of layers and n is the number of the plurality of parallel processors; and a size of each of the disjoint subsets corresponds to a number of the plurality of processors assigned to the processing of the data chunks at one of the plurality of layers; partitioning the task into the plurality of layers independent of partitioning the set of the plurality of parallel processors; assigning each of the plurality of processors to the plurality of layers of the task according to the partitioning of the task such that each of the plurality of processors is busy and each of the data chunks is fully processed within a number of the time steps equal to the number of the plurality of layers, wherein the number of the plurality of parallel processors assigned to the processing of the data chunks at one of the plurality of layers is smaller than the number of the plurality of parallel processors assigned to the processing of the data chunks at a preceding one of the plurality of layers; and selecting and using one of a plurality of constraints for restricting any one of the plurality of parallel processors to always work on the same one of the plurality of layers or, for each of the plurality of layers except for a first layer, to always work on a same data chunk as in each of the previous layers.
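The first constraint in the claim (each processor always works on the same layer) can be sketched as a simple pipeline schedule. This is an illustration under stated assumptions, not the claimed method itself: it assumes the layer sizes halve from one layer to the next, and that the subset for layer k handles chunk t − k at time step t. The names `layer_of_processor` and `schedule` are hypothetical.

```python
def layer_of_processor(d):
    """Map each processor index to a fixed layer.

    Assumption (not from the patent record): layer k owns
    2**(d - 1 - k) consecutive processor indices.
    """
    mapping = []
    for layer in range(d):
        mapping.extend([layer] * 2 ** (d - 1 - layer))
    return mapping


def schedule(d, num_steps):
    """Return, per time step, a dict {processor: (chunk, layer)}.

    Assumed pipeline rule: at time step t, the processors of layer k
    work on chunk t - k, so a chunk entering at step t leaves layer
    d - 1 at step t + d - 1, i.e. it is finished within d time steps.
    """
    layers = layer_of_processor(d)
    plan = []
    for t in range(num_steps):
        step = {}
        for proc, layer in enumerate(layers):
            chunk = t - layer
            if chunk >= 0:            # deeper layers idle during initialization
                step[proc] = (chunk, layer)
        plan.append(step)
    return plan


if __name__ == "__main__":
    d = 3
    for t, step in enumerate(schedule(d, 5)):
        print(f"t={t}: {len(step)} of {2 ** d - 1} processors busy")
```

Under these assumptions, only the first d − 1 time steps leave some processors idle, which corresponds to the initialization steps mentioned in the claim; from step d − 1 onward all 2^d − 1 processors are busy.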
Address: Armonk NY US