Title of invention: Converting a data placement between memory banks and an array processing section
Abstract: In an array processing section, a plurality of data processor elements use data strings entered from input ports to execute predetermined operations while transferring data to each other, and output data strings of the results of the operations from a plurality of output ports. A first data string converter converts data strings stored in a plurality of data storages of a data storage group into a placement suitable for the operations in the array processing section, and enters the converted data strings into the input ports of the array processing section. A second data string converter converts the data strings output from the output ports of the array processing section into a placement to be stored in the plurality of data storages of the data storage group.
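Publication number: US9424230(B2) Publication date: 2016.08.23
Application number: US200812594757 Filing date: 2008.02.22
Applicant: NEC CORPORATION Inventors: Kobori Tomoyoshi; Seki Katsutoshi
Classification: G06F15/173; G06F15/80; G06T1/20 Main classification: G06F15/173
Agency: Agent: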
Principal claim: 1. An array processor type data processing apparatus comprising:
a data storage group comprising a plurality of data storages that store data strings respectively therein;
an array processing section having a plurality of data processor elements arranged in an array of P×T, P and T being natural numbers equal to or larger than two, a plurality of input ports whose number is P, and a plurality of output ports whose number is P, wherein using data strings, whose number is T, entered from each of said plurality of input ports, said plurality of data processor elements execute predetermined operations while transferring data to each other, and output data strings of results of the operations from said plurality of output ports;
a first data string converter that converts data strings stored in said plurality of data storages of said data storage group into a P×T placement suitable for the operations in said array processing section, and enters the converted data strings into said plurality of input ports of said array processing section; and
a second data string converter that converts the data strings output from said plurality of output ports of said array processing section into a T×P placement to be stored in said plurality of data storages of said data storage group;
wherein said array processing section is multithreaded so as to process a plurality of threads, each serving as an independent processing unit continuously by time-division multiplexing, and outputs results of operations on the threads which have been processed by time-division multiplexing in an order in which the threads are processed;
said data storage group stores data sequences, which are data strings for the threads, as units of said threads altogether in said data storages, respectively;
said first data string converter and said second data string converter have resources to convert the placement of the data strings, shared by the threads;
said first data string converter converts the placement of data such that data entered in a plurality of cycles into one input port of the first data string converter are output in one cycle from a plurality of output ports of the first data string converter, and data entered in one cycle into a plurality of input ports of the first data string converter are output in a plurality of cycles from one output port of the first data string converter, thereby converting the placement of the data strings stored in the plurality of data storages of said data storage group into the P×T placement suitable for the operations in said array processing section; and
said second data string converter converts the placement of data such that data entered in one cycle into a plurality of input ports of the second data string converter are output in a plurality of cycles from one output port of the second data string converter, and data entered in a plurality of cycles into one input port of the second data string converter are output in one cycle from a plurality of output ports of the second data string converter, thereby converting the placement of the data strings output from said plurality of output ports of said array processing section into the T×P placement to be stored in said plurality of data storages,
wherein T is a number of said plurality of threads, and P is equal to T,
wherein the P×T placement is a P×T array, the T×P placement is a T×P array,
wherein in each of a plurality of P clock cycles, a successive column of the T×P array is input to said first data string converter,
wherein in each of a plurality of T clock cycles beginning immediately after the plurality of P clock cycles, a successive row of the T×P array is output by said first data string converter as a corresponding successive column of the P×T array.
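The cycle-by-cycle behaviour recited for the first data string converter amounts to a transposition (a "corner turn"): columns of the T×P array enter over P clock cycles, and rows leave over the following T clock cycles. The Python sketch below is only an illustrative software model under stated assumptions, not the claimed hardware. The function name `corner_turn`, the list-of-lists buffer, and the sample values are hypothetical, and the sharing of converter resources among threads and any pipelining are not modelled.

```python
from typing import List


def corner_turn(columns_in: List[List[int]]) -> List[List[int]]:
    """Hypothetical software model of a data string converter.

    columns_in[c] is the column of the T x P array entered in input
    cycle c (length T); there are P such cycles. The return value holds
    one entry per output cycle: rows of the T x P array (length P),
    i.e. successive columns of the P x T array.
    """
    p_count = len(columns_in)        # P input cycles
    t_count = len(columns_in[0])     # T values arrive in each cycle

    # Assumption: a simple T x P buffer stands in for the converter's storage.
    buffer = [[0] * p_count for _ in range(t_count)]

    # Input phase: one column of the T x P array per clock cycle.
    for cycle, column in enumerate(columns_in):
        if len(column) != t_count:
            raise ValueError("every input column must hold T values")
        for t, value in enumerate(column):
            buffer[t][cycle] = value

    # Output phase: one row of the T x P array per clock cycle.
    return [list(buffer[t]) for t in range(t_count)]


# Example with P = T = 3; the value 10*t + p marks row t, column p.
columns = [[0, 10, 20], [1, 11, 21], [2, 12, 22]]
rows = corner_turn(columns)
print(rows)  # [[0, 1, 2], [10, 11, 12], [20, 21, 22]]
```

Because the second data string converter performs the mirror-image conversion, applying the same model to `rows` returns the original `columns`, which matches the round trip from the data storages to the array processing section and back when the array itself leaves the data unchanged.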
Address: Tokyo JP