Abstract |
An arithmetic processing apparatus includes a plurality of arithmetic cores configured to execute threads in parallel, and a control unit configured to cause the arithmetic cores to execute, for each predetermined number of threads, a reduction operation on the data of those threads that write to the same storage area, and to add the data obtained by the reduction operation to the data within the corresponding storage area by an atomic process. |
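
The scheme described above can be illustrated with a minimal software sketch. It is not the claimed hardware: the names `GROUP_SIZE`, `reduce_group`, and `scatter_add` are hypothetical, the group size of 4 is an arbitrary stand-in for the "predetermined number of threads", and a `threading.Lock` is assumed as a substitute for a hardware atomic add. Each group first sums the values destined for the same storage index, then issues one atomic add per distinct index.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import threading

GROUP_SIZE = 4  # hypothetical "predetermined number of threads"


def reduce_group(targets, values):
    """Sum the values of one group's threads that write to the same index."""
    partial = defaultdict(int)
    for target, value in zip(targets, values):
        partial[target] += value
    return dict(partial)


def scatter_add(storage, targets, values, group_size=GROUP_SIZE):
    """Add values into storage, reducing per group before each atomic add.

    Returns the number of atomic adds that were issued.
    """
    lock = threading.Lock()  # assumption: stands in for a hardware atomic add

    def process(start):
        partial = reduce_group(targets[start:start + group_size],
                               values[start:start + group_size])
        for target, subtotal in partial.items():
            with lock:  # one atomic update per distinct index in the group
                storage[target] += subtotal
        return len(partial)

    # Groups run concurrently, loosely mimicking parallel arithmetic cores.
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(process, range(0, len(targets), group_size)))


# Eight simulated threads, each writing one value to one storage index.
storage = [0, 0, 0]
targets = [0, 0, 1, 0, 2, 2, 2, 2]
values = [1, 2, 3, 4, 5, 6, 7, 8]
atomic_ops = scatter_add(storage, targets, values)
print(storage, atomic_ops)
```

In this example the eight writes collapse into three atomic adds, because the first group touches two distinct indices and the second group touches only one. Fewer atomic updates on the same storage area is the contention saving the abstract appears to aim at.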