Main claim |
1. A method for full exploitation, after a plurality of initialization steps, of a set of a plurality of parallel processors to perform a task on a sequence of data chunks by a processor device in a computing environment, wherein each of the data chunks is processed in several time steps and by a plurality of layers, with the plurality of layers being dealt with by at least one of a plurality of processors at each of the time steps, the method comprising:
partitioning the set of the plurality of parallel processors into disjoint subsets according to indices of the set of the plurality of parallel processors, wherein:
the plurality of parallel processors are partitioned in accordance with one or more of a plurality of constraints;
the number of the plurality of parallel processors that are available is n = 2^d − 1, where d is the number of the plurality of layers, n is the number of the plurality of parallel processors, and 2^d − 1 is to be read as (2^d) − 1; and
a size of each of the disjoint subsets corresponds to a number of the plurality of processors assigned to the processing of the data chunks at one of the plurality of layers;
partitioning the task into the plurality of layers independent of partitioning the set of the plurality of parallel processors;
assigning each of the plurality of processors to the plurality of layers of the task according to the partitioning of the task such that each of the plurality of processors is busy and each of the data chunks is fully processed within a number of the time steps equal to the number of the plurality of layers, wherein the number of the plurality of parallel processors assigned to the processing of the data chunks at one of the plurality of layers is smaller than the number of the plurality of parallel processors assigned to the processing of the data chunks at a preceding one of the plurality of layers; and
selecting and using one of a plurality of constraints for restricting any one of the plurality of parallel processors to always work on the same one of the plurality of layers or, for each of the plurality of layers except for a first layer, to always work on a same data chunk as in each of the previous layers;
wherein:
the number of the plurality of parallel processors that are available is n = 2^d − 1, where d is the number of the plurality of layers, n is the number of the plurality of parallel processors, and 2^d − 1 is to be read as (2^d) − 1;
the number of the plurality of processors assigned to level 0 is 2^(d−1); and
the number of the plurality of processors assigned to the processing of the data chunks at one of the plurality of layers which is not the first is half of the number of the plurality of processors assigned to the processing of the data chunks at the previous one of the plurality of layers. |
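
The processor counts recited in the claim can be illustrated with a minimal sketch. It assumes that the disjoint subsets are contiguous index ranges, which is only one possible reading of "according to indices"; the function name `partition_processors` is hypothetical and does not appear in the claim. With d layers, layer 0 receives 2^(d−1) processors and each later layer half as many, so the subset sizes sum to 2^d − 1.

```python
def partition_processors(d):
    """Split processor indices 0..n-1, with n = 2**d - 1, into d disjoint subsets.

    Layer 0 gets 2**(d-1) processors and every later layer gets half as many
    as the layer before it. Using contiguous index ranges is an assumption
    made for illustration only.
    """
    if d < 1:
        raise ValueError("d must be at least 1")
    n = 2**d - 1
    subsets = []
    start = 0
    for layer in range(d):
        size = 2 ** (d - 1 - layer)  # halves from one layer to the next
        subsets.append(list(range(start, start + size)))
        start += size
    # The geometric series 2**(d-1) + ... + 2 + 1 uses all n processors.
    assert start == n
    return subsets


if __name__ == "__main__":
    for layer, procs in enumerate(partition_processors(3)):
        print(f"layer {layer}: processors {procs}")
```

For d = 3 this yields subsets of sizes 4, 2 and 1 over the 7 available processors, so no processor is left unassigned.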