Title: LOAD AND STORE ORDERING FOR A STRONGLY ORDERED SIMULTANEOUS MULTITHREADING CORE
Abstract: A mechanism for simultaneous multithreading is provided. Responsive to performing a store instruction for a given thread of threads on a processor core and responsive to the core having ownership of a cache line in a cache, an entry of the store instruction is placed in a given store queue belonging to the given thread. The entry for the store instruction has a starting memory address and an ending memory address on the cache line. The starting memory addresses through ending memory addresses of load queues of the threads are compared on a byte-per-byte basis against the starting through ending memory address of the store instruction. Responsive to one memory address byte in the starting through ending memory addresses in the load queues overlapping with a memory address byte in the starting through ending memory address of the store instruction, the threads having that memory address byte are flushed.
Publication number: US2016103681(A1) Publication date: 2016.04.14
Application number: US201414511408 Filing date: 2014.10.10
Applicant: International Business Machines Corporation Inventors: Alexander Khary J.; Hsieh Jonathan T.; Jacobi Christian; Recktenwald Martin
Classification: G06F9/30; G06F12/08 Main classification: G06F9/30
Agency: Agent:
Main claim: 1. A system for simultaneous multithreading (SMT), the system comprising: a cache; and a processor core comprising circuitry to execute threads by SMT, each of the threads having its own load queue and store queue, wherein in response to performing a store instruction for a given thread of the threads on the processor core and in response to the processor core having ownership of a cache line in the cache, the processor core is configured to execute the store instruction comprising: placing an entry of the store instruction in a given store queue belonging to the given thread, the entry for the store instruction having a starting memory address and an ending memory address on the cache line; comparing starting memory addresses through ending memory addresses of load queues of the threads on a byte-per-byte basis against the starting memory address through the ending memory address of the store instruction; in response to at least one memory address byte in the starting through ending memory addresses in the load queues of the threads overlapping with a memory address byte in the starting through ending memory address of the store instruction, flushing one or more of the threads having the at least one memory address byte; and in response to no overlap, allowing entries in the load queues of the threads to remain.
Address: Armonk NY US
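
Illustrative sketch (not part of the patent record): the store-time comparison recited in the main claim can be modeled in software roughly as follows. This is a minimal Python sketch under stated assumptions, not the claimed circuitry. The names `QueueEntry`, `Thread`, `bytes_overlap`, and `execute_store` are hypothetical. Address ranges are treated as inclusive, a "flush" is modeled simply as emptying the affected thread's load queue, and the load queues of all threads (including the storing thread) are checked; the record does not spell out these details. For contiguous ranges, the interval test used here gives the same result as comparing every byte address individually.

```python
from dataclasses import dataclass, field


@dataclass
class QueueEntry:
    start: int  # first byte address touched (inclusive, assumed)
    end: int    # last byte address touched (inclusive, assumed)


@dataclass
class Thread:
    tid: int
    load_queue: list = field(default_factory=list)
    store_queue: list = field(default_factory=list)


def bytes_overlap(a: QueueEntry, b: QueueEntry) -> bool:
    """True if the two inclusive byte ranges share at least one byte address."""
    return a.start <= b.end and b.start <= a.end


def execute_store(threads: dict, tid: int, store: QueueEntry,
                  core_owns_line: bool = True) -> list:
    """Model of the claimed store handling; returns the ids of flushed threads."""
    if not core_owns_line:
        # The claim only covers the case where the core owns the cache line;
        # this sketch does nothing otherwise.
        return []

    # Step 1: place the store entry in the storing thread's own store queue.
    threads[tid].store_queue.append(store)

    # Step 2: compare the store's byte range against every thread's load queue.
    flushed = []
    for t in threads.values():
        if any(bytes_overlap(load, store) for load in t.load_queue):
            # Step 3: overlap found, so flush this thread
            # (modeled here as discarding its queued loads).
            t.load_queue.clear()
            flushed.append(t.tid)
        # Step 4: with no overlap, the thread's load queue is left untouched.
    return flushed


# Usage: thread 0 stores to bytes 0x104..0x10B.
threads = {i: Thread(i) for i in range(3)}
threads[1].load_queue.append(QueueEntry(0x100, 0x107))  # overlaps the store
threads[2].load_queue.append(QueueEntry(0x110, 0x117))  # does not overlap
flushed = execute_store(threads, 0, QueueEntry(0x104, 0x10B))
print(flushed)
```

In this example only thread 1 holds a load that shares a byte address with the store, so it is the only thread flushed; the load entry in thread 2 remains.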