Title of invention: ARITHMETIC CONTROL APPARATUS, ARITHMETIC CONTROL METHOD, NON-TRANSITORY COMPUTER READABLE MEDIUM STORING PROGRAM, AND OPEN CL DEVICE
Abstract: When the arithmetic unit executes a first kernel and a second kernel that are related to each other, if an allocation attribute of a continuous write block of the first kernel and an allocation attribute of a continuous read block of the second kernel corresponding to that continuous write block are the same, a scenario determination unit executes the first kernel and the second kernel in a pipeline, passing the continuous write block to the second kernel through the private memory or the local memory without transferring it to the global memory. In doing so, the scenario determination unit logically adds the margin attribute and the dependence attribute of the continuous read block of the second kernel to the margin attribute and the dependence attribute, respectively, set for each read block of the first kernel.
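Publication number: US2015227466(A1); Publication date: 2015.08.13
Application number: US201514600959; Filing date: 2015.01.20
Applicant: Renesas Electronics Corporation; Inventor: Kyo Shorin
Classification: G06F12/08; Main classification: G06F12/08
Agency: (not listed); Agent: (not listed)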
Independent claim: 1. An arithmetic control apparatus that controls parallel processing by a plurality of processing elements of an OpenCL (Open Computing Language) device including the plurality of processing elements and a plurality of memories in different hierarchies provided for the plurality of processing elements, comprising:
an attribute group storing unit that acquires and stores an attribute group set for each of a read block being one or more data blocks stored in a memory in a lowest hierarchy among the plurality of memories and having data transferred to a memory in a different hierarchy for the parallel processing and a write block being one or more data blocks transferred from the memory in the different hierarchy to the memory in the lowest hierarchy after the parallel processing as a result of the parallel processing on the one or more read blocks; and
a scenario determination unit that determines a transfer method of each of the read block and the write block based on each of the attribute group stored in the attribute group storing unit and a configuration parameter indicating a configuration of the OpenCL device, and performs transfer of the read block and the write block according to the determined transfer method and control of the parallel processing corresponding to the transfer,
wherein the attribute group includes a plurality of attributes required for determining the transfer method and not depending on the configuration of the OpenCL device, including:
  an allocation attribute indicating whether to segment the data block into a plurality of sub-blocks and transfer the sub-blocks and indicating a segmentation method when segmenting the data block,
  a margin attribute indicating a size of data neighboring the sub-blocks transferred together with the sub-blocks when segmenting the data block into a plurality of sub-blocks and transferring the sub-blocks, and
  a dependence attribute indicating whether the sub-blocks have dependence with other neighboring sub-blocks when segmenting the data block into a plurality of sub-blocks and transferring the sub-blocks and indicating all dependence directions when there is the dependence,
the attribute group of the write block is set based on an assumption that the write block already exists in the memory in the different hierarchy and is transferred to the memory in the lowest hierarchy,
when a first kernel and a second kernel are executed in succession in the OpenCL device, the write block of parallel processing corresponding to the first kernel includes a continuous write block used as the read block of parallel processing corresponding to the second kernel, and the allocation attribute of the continuous write block set for the first kernel and the allocation attribute of the read block corresponding to the continuous write block set for the second kernel are the same, the scenario determination unit performs pipelining control that executes the first kernel and the second kernel in a pipeline by using the continuous write block for execution of the second kernel through the memory in the different hierarchy without transferring the continuous write block to the lowest hierarchy, and
in the pipeline control, the scenario determination unit logically adds the margin attribute and the dependence attribute of the read block corresponding to the continuous write block set for the second kernel respectively to the margin attribute and the dependence attribute set for the read block for each of the read block corresponding to the first kernel.
Address: Kawasaki-shi JP
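
Illustrative note: the pipelining condition in the abstract and claim 1 can be sketched in a few lines of Python. This is only a rough model under stated assumptions, not the patented implementation. The record does not say how attributes are encoded, so the sketch assumes the margin and dependence attributes are bit fields and reads "logically adds" as a bitwise OR. The names `Direction`, `AttributeGroup`, `can_pipeline`, and `merge_for_pipeline`, and the `"row"`-style allocation labels, are hypothetical.

```python
from dataclasses import dataclass, replace
from enum import IntFlag
from typing import List, Optional


class Direction(IntFlag):
    """Hypothetical bit encoding of dependence directions (not from the source)."""
    NONE = 0
    UP = 1
    DOWN = 2
    LEFT = 4
    RIGHT = 8


@dataclass(frozen=True)
class AttributeGroup:
    allocation: str        # hypothetical label for the segmentation method
    margin: int            # ASSUMPTION: margin encoded as an integer bit field
    dependence: Direction  # directions in which a sub-block depends on neighbors


def can_pipeline(cont_write_k1: AttributeGroup, cont_read_k2: AttributeGroup) -> bool:
    """Pipelining is considered only when both allocation attributes are the same."""
    return cont_write_k1.allocation == cont_read_k2.allocation


def merge_for_pipeline(
    reads_k1: List[AttributeGroup],
    cont_write_k1: AttributeGroup,
    cont_read_k2: AttributeGroup,
) -> Optional[List[AttributeGroup]]:
    """Return updated read-block attributes for the first kernel, or None.

    ASSUMPTION: "logically adds" is modeled as bitwise OR on both attributes.
    """
    if not can_pipeline(cont_write_k1, cont_read_k2):
        return None  # no pipelining; the write block would go to the lowest hierarchy
    return [
        replace(
            r,
            margin=r.margin | cont_read_k2.margin,
            dependence=r.dependence | cont_read_k2.dependence,
        )
        for r in reads_k1
    ]


if __name__ == "__main__":
    k1_read = AttributeGroup("row", 0b01, Direction.UP)
    k1_write = AttributeGroup("row", 0b00, Direction.NONE)
    k2_read = AttributeGroup("row", 0b10, Direction.LEFT)
    print(merge_for_pipeline([k1_read], k1_write, k2_read))
```

In this model the first kernel's read blocks inherit the second kernel's margin and dependence requirements, so a sub-block loaded once into the faster memory carries enough neighboring data for both kernels. A real margin attribute expresses a size, so a faithful implementation may need a different combination rule than the OR assumed here.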