Title: Algorithm for vectorization and memory coalescing during compiling
Abstract: One embodiment of the present invention sets forth a technique for reducing the number of assembly instructions included in a computer program. The technique involves receiving a directed acyclic graph (DAG) that includes a plurality of nodes, where each node includes an assembly instruction of the computer program, hierarchically parsing the plurality of nodes to identify at least two assembly instructions that are vectorizable and can be replaced by a single vectorized assembly instruction, and replacing the at least two assembly instructions with the single vectorized assembly instruction.
Publication No.: US9639336(B2) Publication date: 2017.05.02
Application No.: US201213660986 Filing date: 2012.10.25
Applicant: NVIDIA Corporation Inventors: Grover Vinod;Kudlur Manjunath;Murphy Michael
Classification: G06F9/44;G06F9/45;G06F9/50 Main classification: G06F9/44
Agency: Artegis Law Group, LLP Agent: Artegis Law Group, LLP
Principal claim: 1. A processor-implemented method for reducing the number of assembly instructions included in a computer program, comprising: receiving a directed acyclic graph (DAG) that includes a plurality of nodes, wherein each node includes an assembly instruction of the computer program; identifying nodes from within a first set that includes the plurality of nodes and that is deemed to be an unprocessed set that do not have any predecessors; moving the identified nodes into a second set; grouping the nodes included in the second set into different groups of nodes based on a type of assembly instruction included in each node; for a first group of nodes included in the different groups of nodes, identifying at least two assembly instructions corresponding to two or more nodes included in the first group of nodes that are vectorizable and can be replaced by a first single vectorized assembly instruction; for a second group of nodes included in the different groups of nodes, identifying at least two assembly instructions corresponding to two or more nodes in the second group of nodes that are vectorizable and can be replaced by a second single vectorized assembly instruction; replacing the at least two assembly instructions associated with the first group of nodes with the first single vectorized assembly instruction; and replacing the at least two assembly instructions associated with the second group of nodes with the second single vectorized assembly instruction.
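Illustrative sketch (not part of the patent text): the steps recited in claim 1 can be pictured roughly as follows. Everything concrete here is an assumption made for illustration only: the `Node` fields, the function name `vectorize_dag`, the `.v2` output notation, and in particular the vectorizability test (same instruction type, same base register, byte offsets exactly one access width apart). The record itself does not say how vectorizability is decided.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical DAG node holding one assembly instruction."""
    name: str
    opcode: str      # instruction type, e.g. "ld" or "st"
    base: str        # base register of the memory access (assumed field)
    offset: int      # byte offset from the base register (assumed field)
    preds: set = field(default_factory=set)  # names of predecessor nodes


def vectorize_dag(nodes, width=4):
    """Process the DAG layer by layer and merge adjacent accesses.

    Returns the emitted instructions as strings, in emission order.
    """
    unprocessed = {n.name: n for n in nodes}   # the "first set"
    emitted = []
    while unprocessed:
        # Nodes with no predecessor left in the unprocessed set.
        ready = [n for n in unprocessed.values()
                 if not any(p in unprocessed for p in n.preds)]
        if not ready:
            raise ValueError("graph contains a cycle")
        # Move them into the "second set".
        for n in ready:
            del unprocessed[n.name]
        # Group the second set by instruction type.
        groups = defaultdict(list)
        for n in ready:
            groups[n.opcode].append(n)
        # Within each group, pair up accesses that the assumed test accepts.
        for opcode, group in groups.items():
            group.sort(key=lambda n: (n.base, n.offset))
            i = 0
            while i < len(group):
                a = group[i]
                b = group[i + 1] if i + 1 < len(group) else None
                if b and a.base == b.base and b.offset - a.offset == width:
                    # Two scalar instructions become one vector instruction.
                    emitted.append(f"{opcode}.v2 [{a.base}+{a.offset}]")
                    i += 2
                else:
                    emitted.append(f"{opcode} [{a.base}+{a.offset}]")
                    i += 1
    return emitted


if __name__ == "__main__":
    dag = [
        Node("a", "ld", "r1", 0),
        Node("b", "ld", "r1", 4),
        Node("c", "ld", "r2", 0),
        Node("d", "st", "r3", 0, {"a"}),
        Node("e", "st", "r3", 4, {"b"}),
    ]
    for line in vectorize_dag(dag):
        print(line)
```

In this toy input, the two loads from `r1` sit in the same layer and the same group, so they are merged; the load from `r2` has no adjacent partner and stays scalar; the two stores become ready only after the loads are processed and are then merged in the second pass. Five input instructions are reduced to three.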
Address: Santa Clara CA US