Title of invention Efficient implementation of arrays of structures on SIMT and SIMD architectures
Abstract One embodiment of the present invention sets forth a technique providing an optimized way to allocate and access memory across a plurality of thread/data lanes. Specifically, the device driver receives an instruction targeted to a memory set up as an array of structures of arrays. The device driver computes an address within the memory using information about the number of thread/data lanes and parameters from the instruction itself. The result is a memory allocation and access approach where the device driver properly computes the target address in the memory. Advantageously, processing efficiency is improved where memory in a parallel processing subsystem is internally stored and accessed as an array of structures of arrays, proportional to the SIMT/SIMD group width (the number of threads or lanes per execution group).
Publication number US8751771(B2) Publication date 2014.06.10
Application number US201113247855 Filing date 2011.09.28
Applicant NVIDIA Corporation Inventors Fahs Brian;Nickolls John R.;Moreton Henry Packard;Coon Brett W.
Classification G06F12/00;G06F13/00;G06F13/28;G06F9/26;G06F9/34;G06F9/38;G06F9/30 Main classification G06F12/00
Agency Agent
Principal claim 1. A computer-implemented method for accessing data in a data structure stored in a memory, the method comprising: receiving a memory access instruction that includes a base address of the data structure that corresponds to a first memory location within the memory; computing a first partial offset relative to the base address that is proportional to a position of a target structure of arrays within the data structure; computing a second partial offset by adding to the first partial offset a position of a target structure within the target structure of arrays; computing a third partial offset by adding to the second partial offset a position of a target field within the target structure; and accessing a location within the memory corresponding to the base address plus the third partial offset.
Address Santa Clara CA US
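
Illustrative sketch (not part of the patent record). The three partial offsets recited in the principal claim can be pictured with a small C function. This is only a sketch under stated assumptions, not the patented implementation: the type aosoa_layout, the function aosoa_offset, the fixed per-field element size, and the byte scaling applied to each position are all hypothetical choices made for illustration. It assumes each structure of arrays holds group_width structures and stores each field of those structures contiguously.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout description; none of these names come from the patent. */
typedef struct {
    size_t group_width; /* structures per structure of arrays (SIMT/SIMD width) */
    size_t num_fields;  /* fields per structure */
    size_t elem_size;   /* bytes per field; assumed identical for all fields */
} aosoa_layout;

/* Byte offset, relative to the base address, of one field of one structure. */
static size_t aosoa_offset(const aosoa_layout *l,
                           size_t struct_index, size_t field_index)
{
    assert(l->group_width > 0);
    assert(field_index < l->num_fields);

    size_t soa_index = struct_index / l->group_width; /* which structure of arrays */
    size_t lane      = struct_index % l->group_width; /* position inside it */
    size_t soa_bytes = l->num_fields * l->group_width * l->elem_size;

    /* First partial offset: proportional to the position of the target
       structure of arrays within the data structure. */
    size_t first = soa_index * soa_bytes;

    /* Second partial offset: add the position of the target structure
       within that structure of arrays (scaled to bytes, an assumption). */
    size_t second = first + lane * l->elem_size;

    /* Third partial offset: add the position of the target field, scaled
       by the stride between consecutive field arrays (an assumption). */
    size_t third = second + field_index * l->group_width * l->elem_size;

    return third; /* the access goes to base address + third */
}
```

Under these assumptions, with a group width of 32, three 4-byte fields per structure, structure index 33 and field index 2, the partial offsets are 384, 388, and 644 bytes, so the access lands 644 bytes past the base address. Neighbouring lanes reading the same field touch consecutive 4-byte words, which is the access pattern the abstract describes as efficient.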