Title of invention: Method and system for converting a single-threaded software program into an application-specific supercomputer
Abstract: The invention comprises (i) a compilation method for automatically converting a single-threaded software program into an application-specific supercomputer, and (ii) the supercomputer system structure generated as a result of applying this method. The compilation method comprises: (a) converting an arbitrary code fragment from the application into customized hardware whose execution is functionally equivalent to the software execution of the code fragment; and (b) generating interfaces on the hardware and software parts of the application, which (1) perform a software-to-hardware program state transfer at the entries of the code fragment; (2) perform a hardware-to-software program state transfer at the exits of the code fragment; and (3) maintain memory coherence between the software and hardware memories. If the resulting hardware design is large, it is divided into partitions such that each partition can fit into a single chip. Then, a single union chip is created which can realize any of the partitions.
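The partitioning and union-chip step at the end of the abstract can be illustrated with a small sketch. It is not the patent's algorithm: the greedy first-fit partitioner, the component names, the area figures, and the rule "the union chip holds, for each component type, the largest count any one partition needs" are all assumptions made for illustration.

```python
from collections import Counter

def partition_first_fit(components, capacity):
    """Split (type_name, area) components into partitions by greedy first-fit.

    Hypothetical stand-in for the partitioning step; the abstract does not
    say which partitioning algorithm is used.
    """
    partitions = []
    for name, area in components:
        if area > capacity:
            raise ValueError(f"component {name!r} does not fit on one chip")
        for part in partitions:
            if sum(a for _, a in part) + area <= capacity:
                part.append((name, area))
                break
        else:
            # No existing partition has room, so open a new one.
            partitions.append([(name, area)])
    return partitions

def union_chip(partitions):
    """Per component type, the maximum count needed by any single partition.

    Assumed reading of "a single union chip ... which can realize any of
    the partitions". This sketch does not check that the union itself still
    fits within the chip capacity.
    """
    union = Counter()
    for part in partitions:
        for name, count in Counter(n for n, _ in part).items():
            union[name] = max(union[name], count)
    return dict(union)

# Made-up design: three ALUs, one cache, one network block.
components = [("alu", 4), ("alu", 4), ("cache", 6), ("alu", 4), ("net", 3)]
partitions = partition_first_fit(components, capacity=10)
union = union_chip(partitions)
```

With these made-up numbers the design splits into three partitions, and a chip built from `union` has enough instances of every component type to be configured as any one of them.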
Publication number: US2015317190(A1) Publication date: 2015.11.05
Application number: US201414581169 Filing date: 2014.12.23
Applicant: Ebcioglu Kemal;Kultursay Emre Inventor: Ebcioglu Kemal;Kultursay Emre
Classification: G06F9/52 Main classification: G06F9/52
Agency: Agent:
Principal claim: 1. A general-purpose supercomputer for performing parallel execution of parallel software compiled from a code fragment within a single-threaded software application, where the general-purpose supercomputer comprises: a. a plurality of general-purpose processors; b. at least one task network connected to the plurality of general-purpose processors, allowing a first general-purpose processor on the task network to send a task invocation request to a second general-purpose processor on said task network, and to receive back from the second general-purpose processor either a task result message or a task completion acknowledgement; c. at least one hardware synchronization unit to ensure that if a memory instruction instance I2 is dependent on a memory instruction instance I1 in sequential execution of the code fragment, the memory instruction instance I2 is executed after the memory instruction instance I1 in the parallel execution of the parallel software performed by the general-purpose supercomputer; and d. at least one coherent memory hierarchy, which: (i) supports a plurality of load/store ports that are accessed by the plurality of general-purpose processors in parallel; and (ii) signals a completion of each memory instruction issued from each load/store port of the plurality of load/store ports, for supporting synchronization units; where parallel execution of the parallel software by the general-purpose supercomputer is functionally equivalent to sequential execution of the code fragment within the single-threaded software application; and where the general-purpose supercomputer is implemented as a plurality of copies of a union module implemented in ASIC technology, with scalable network connections, and where the union module implemented in ASIC technology is able to perform function of any of a plurality of modules resulting from partitioning a hardware design of the general-purpose supercomputer.
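The ordering guarantee in element c and the completion signal in element d(ii) can be modeled in software. The sketch below is a toy simulation, not the claimed hardware. It assumes that "dependent" means two accesses to the same address where at least one is a store. The `MemOp` type and both `run_*` functions are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MemOp:
    seq: int        # position in sequential program order
    kind: str       # "load" or "store"
    addr: int
    value: int = 0  # value written by a store; unused for loads

def depends(later, earlier):
    """Assumed dependence test: same address, at least one of the two is a store."""
    return (earlier.seq < later.seq
            and earlier.addr == later.addr
            and "store" in (earlier.kind, later.kind))

def execute(op, memory, loads):
    if op.kind == "store":
        memory[op.addr] = op.value
    else:
        loads[op.seq] = memory.get(op.addr, 0)

def run_sequential(ops):
    memory, loads = {}, {}
    for op in sorted(ops, key=lambda o: o.seq):
        execute(op, memory, loads)
    return memory, loads

def run_parallel(ops, issue_order):
    """Issue ops in any order, but hold each op until every op it depends on
    has signaled completion (the role played by the synchronization unit)."""
    by_seq = {op.seq: op for op in ops}
    pending = [by_seq[s] for s in issue_order]
    completed = set()
    memory, loads = {}, {}
    while pending:
        for op in list(pending):
            if any(depends(op, e) and e.seq not in completed for e in ops):
                continue  # an earlier dependent access has not completed yet
            execute(op, memory, loads)
            completed.add(op.seq)  # completion signal from the memory hierarchy
            pending.remove(op)
    return memory, loads

ops = [
    MemOp(0, "store", addr=100, value=1),
    MemOp(1, "load", addr=100),
    MemOp(2, "store", addr=100, value=2),
    MemOp(3, "store", addr=200, value=5),
    MemOp(4, "load", addr=200),
]
# Issue in reverse program order; the results should still match sequential execution.
parallel_result = run_parallel(ops, issue_order=[4, 3, 2, 1, 0])
sequential_result = run_sequential(ops)
```

Accesses to different addresses may complete out of program order, while dependent accesses keep their sequential order. That is why the two results agree, which mirrors the functional-equivalence condition in the claim.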
Address: Katonah, NY, US