Title: Multistage collector for outputs in multiprocessor systems
Abstract: Aspects include a multistage collector to receive outputs from plural processing elements. Processing elements may comprise (each or collectively) a plurality of clusters, with one or more ALUs that may perform SIMD operations on a data vector and produce outputs according to the instruction stream being used to configure the ALU(s). The multistage collector includes substituent components each with at least one input queue, a memory, a packing unit, and an output queue; these components can be sized to process groups of input elements of a given size, and can have multiple input queues and a single output queue. Some components couple to receive outputs from the ALUs and others receive outputs from other components. Ultimately, the multistage collector can output groupings of input elements. Each grouping of elements (e.g., at input queues, or stored in the memories of the components) can be formed based on matching of index elements.
Publication number: US9595074(B2) Publication date: 2017.03.14
Application number: US201213611325 Filing date: 2012.09.12
Applicant: Imagination Technologies Limited Inventors: McCombe James Alexander; Clohset Steven John; Redgrave Jason Rupert; Peterson Luke Tilman
Classification: G06F15/80; G06T1/20; G06T15/06 Main classification: G06F15/80
Agency: Vorys, Sater, Seymour and Pease LLP Agent: Vorys, Sater, Seymour and Pease LLP; DeLuca Vincent M
Principal claim: 1. A method of increasing processing throughput in a multiprocessor system having a plurality of computation units each processing different computation tasks asynchronously, comprising: asynchronously receiving outputs from a plurality of said computation units of the multiprocessor system, each of the outputs comprising an index element and one or more constituent elements associated with that index element, wherein an index element describes a computation task to be performed for the one or more constituent elements; grouping at least some constituent elements of said asynchronously received outputs into packets by comparing their respective index elements and grouping into individual packets those constituent elements associated with matching index elements, said individual packets being associated with respective matching index elements and having a first predetermined size; grouping at least some individual packets of constituent elements into larger individual packets until packets of a predetermined maximum size have been assembled by comparing respective index elements of packets of similar size and grouping packets associated with matching index elements into said larger individual packets; and outputting individual packets of said maximum size as output packets, whereby processing throughput of output packets by said multiprocessor system is increased.
Address: Kings Langley GB
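
Illustrative sketch (not part of the patent record): the grouping steps recited in the principal claim can be modeled in software roughly as follows. This is a minimal sketch under stated assumptions, not the patented hardware design. The function name `collect`, the parameters `first_size` and `max_size`, and the choice to merge packets pairwise (so that each stage doubles the packet size) are all hypothetical; the claim only requires that packets of similar size with matching index elements be grouped into larger packets until a predetermined maximum size is reached.

```python
from collections import defaultdict


def collect(outputs, first_size=2, max_size=8):
    """Group (index, elements) outputs into max_size packets per index.

    Assumption: max_size equals first_size times a power of two, because
    this sketch merges packets pairwise at each stage.
    Returns (emitted, waiting): the full packets in emission order, and
    the partially merged packets still held, keyed by size then index.
    """
    if first_size < 1:
        raise ValueError("first_size must be at least 1")
    size = first_size
    while size < max_size:
        size *= 2
    if size != max_size:
        raise ValueError("max_size must be first_size times a power of two")

    pending = defaultdict(list)   # index -> elements not yet in a packet
    waiting = defaultdict(dict)   # size -> {index: packet awaiting a partner}
    emitted = []                  # (index, packet) pairs of max_size

    def promote(index, packet):
        # Output a full packet, or pair it with a same-size packet
        # carrying a matching index element and try the next stage.
        if len(packet) >= max_size:
            emitted.append((index, packet))
            return
        partner = waiting[len(packet)].pop(index, None)
        if partner is None:
            waiting[len(packet)][index] = packet
        else:
            promote(index, partner + packet)

    for index, elements in outputs:
        # First stage: pool constituent elements by matching index element
        # and cut packets of the first predetermined size.
        pending[index].extend(elements)
        while len(pending[index]) >= first_size:
            packet = pending[index][:first_size]
            del pending[index][:first_size]
            promote(index, packet)

    return emitted, waiting
```

Under these assumptions, calling `collect` on a stream in which index `"A"` contributes eight elements and index `"B"` contributes two yields one full packet for `"A"`, while the two `"B"` elements remain held as a size-2 packet awaiting a partner.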