Title of invention Method and system for parallelization of pipelined computations
Abstract A method of parallelizing a pipeline includes stages operable on a sequence of work items. The method includes allocating an amount of work for each work item, assigning at least one stage to each work item, partitioning the at least one stage into at least one team, partitioning the at least one team into at least one gang, and assigning the at least one team and the at least one gang to at least one processor. Processors, gangs, and teams are juxtaposed near one another to minimize communication losses.
Publication number US9110726(B2) Publication date 2015.08.18
Application number US200712513838 Filing date 2007.11.08
Applicant QUALCOMM Incorporated Inventors Kotlyar Vladimir;Moudgill Mayan;Pogudin Yurly M.
Classification G06F9/46;G06F15/00;G06F9/50;G06F9/38 Main classification G06F9/46
Agency Knobbe Martens Olson & Bear LLP Agent Knobbe Martens Olson & Bear LLP
Main claim 1. A method of parallelizing a pipeline configured to perform a computational process with a series of pipeline stages, the method comprising: defining a work item for performance by a single iteration across at least two stages comprising a first stage and a second stage, wherein the at least two stages are performed across at least three units of time comprising a current unit of time; assigning the second stage at the current unit of time to a team of physical processors, where two or more iterations are being performed by the second stage in parallel with another iteration being performed by the first stage at the current unit of time, where the team has a number of physical processors defined by a first number of physical processors, wherein the first number is two or more; partitioning the team into at least two gangs of physical processors, each gang having a number of physical processors defined by a second number of physical processors less than the first number of physical processors, wherein the at least two gangs are configured such that different gangs process different iterations assigned to the second stage in parallel with each other differently by receiving different iteration outputs from a previous stage and producing different iteration outputs at a future stage for different iterations at the future stage; and processing the different iterations at the second stage at the current unit of time, wherein the processing the different iterations at the second stage at the current unit of time is selected from the group consisting of processing different iterations using different physical processors, different parts of the same iteration using different physical processors and different parts of different iterations using different physical processors.
Address San Diego CA US
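Illustrative note on claim 1. The team and gang partitioning recited in the main claim can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the patented implementation: the names `Gang`, `partition_team`, and `assign_iterations` are hypothetical, processors are modeled as plain integer ids, neighboring ids are assumed to be physically adjacent (so contiguous gangs loosely mirror the abstract's point about placing gangs near one another), and round-robin assignment is one possible policy that the record does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Gang:
    """A subset of a team's processors (hypothetical helper type)."""
    gang_id: int
    processors: Tuple[int, ...]


def partition_team(team: List[int], num_gangs: int) -> List[Gang]:
    """Split a team of processor ids into contiguous gangs.

    Mirrors the claim's constraints: the team has two or more
    processors, there are at least two gangs, and every gang is
    strictly smaller than the team.
    """
    if len(team) < 2:
        raise ValueError("a team needs two or more processors")
    if num_gangs < 2 or num_gangs > len(team):
        raise ValueError("need at least two gangs, each with a processor")
    base, extra = divmod(len(team), num_gangs)
    gangs: List[Gang] = []
    start = 0
    for g in range(num_gangs):
        # The first `extra` gangs take one additional processor.
        size = base + (1 if g < extra else 0)
        gangs.append(Gang(g, tuple(team[start:start + size])))
        start += size
    return gangs


def assign_iterations(gangs: List[Gang], iterations: List[int]) -> Dict[int, Gang]:
    """Assumed policy: round-robin, so different gangs get different iterations."""
    return {it: gangs[idx % len(gangs)] for idx, it in enumerate(iterations)}


# Second-stage team of four processors, split into two gangs of two.
team = [0, 1, 2, 3]
gangs = partition_team(team, num_gangs=2)
schedule = assign_iterations(gangs, iterations=[10, 11, 12, 13])
for it, gang in schedule.items():
    print(f"iteration {it} -> gang {gang.gang_id} on processors {gang.processors}")
```

In this sketch, iterations 10 and 11 land on different gangs and could therefore be processed by the second stage in the same unit of time, while a separate first-stage team (not modeled here) works on a later iteration.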