Title LOAD STORE BUFFER AGNOSTIC TO THREADS IMPLEMENTING FORWARDING FROM DIFFERENT THREADS BASED ON STORE SENIORITY
Abstract In a processor, a thread-agnostic unified store queue and unified load queue method for out-of-order loads in a memory consistency model using shared memory resources. The method includes implementing a memory resource that can be accessed by a plurality of asynchronous cores, wherein the plurality of cores share a unified store queue and a unified load queue; and implementing an access mask that tracks which words of a cache line are accessed via a load, wherein the cache line includes the memory resource, wherein the load sets a mask bit within the access mask when accessing a word of the cache line, and wherein the mask bit blocks accesses to that word by other loads from the plurality of cores. The method further includes checking the access mask upon execution of subsequent stores from the plurality of cores to the cache line, wherein stores from different threads can forward to loads of different threads while still maintaining in-order memory consistency semantics; and causing a misprediction when a subsequent store to a marked portion of the cache line sees a prior mark from a load in the access mask, wherein the subsequent store signals the load queue entry corresponding to that load by using a tracker register and a thread ID register.
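The load-side half of the access-mask mechanism described in the abstract can be illustrated with a minimal sketch. The code below is a hypothetical model, not taken from the specification: the structure name access_mask_t, the function mark_load, and the 64-byte line / 16-word geometry are all assumptions. Each cache line carries one mask bit per word, and a load from any core sets the bit for the word it reads while recording which unified load-queue entry and which thread performed the access.

```c
#include <stdint.h>

#define WORDS_PER_LINE 16   /* assumed: 64-byte line, 4-byte words */

/* Per-cache-line access mask: one bit per word, plus bookkeeping that
 * lets a later store find the load that marked each word. */
typedef struct {
    uint16_t mask;                       /* bit i set => word i was read by a load */
    uint8_t  lq_entry[WORDS_PER_LINE];   /* tracker: unified load-queue entry index */
    uint8_t  thread_id[WORDS_PER_LINE];  /* thread that performed the load          */
} access_mask_t;

/* A load from any core marks the word it accessed in the shared mask. */
static void mark_load(access_mask_t *am, unsigned word,
                      uint8_t lq_entry, uint8_t thread_id)
{
    am->mask |= (uint16_t)(1u << word);
    am->lq_entry[word]  = lq_entry;
    am->thread_id[word] = thread_id;
}
```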
Publication No. US2015205605(A1)  Publication Date 2015.07.23
Application No. US201414569554  Filing Date 2014.12.12
Applicant Soft Machines, Inc.  Inventor ABDALLAH Mohammad
Classification G06F9/30; G06F12/14  Main Classification G06F9/30
Main Claim 1. In a processor, a thread-agnostic unified store queue and unified load queue method for out-of-order loads in a memory consistency model using shared memory resources, comprising: implementing a memory resource that can be accessed by a plurality of asynchronous cores, wherein the plurality of cores share a unified store queue and a unified load queue; implementing an access mask that tracks which words of a cache line are accessed via a load, wherein the cache line includes the memory resource, wherein the load sets a mask bit within the access mask when accessing a word of the cache line, and wherein the mask bit blocks accesses to that word by other loads from the plurality of cores; checking the access mask upon execution of subsequent stores from the plurality of cores to the cache line, wherein stores from different threads can forward to loads of different threads while still maintaining in-order memory consistency semantics; and causing a misprediction when a subsequent store to a marked portion of the cache line sees a prior mark from a load in the access mask, wherein the subsequent store signals the load queue entry corresponding to that load by using a tracker register and a thread ID register.
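As a companion to the load-side sketch above (and reusing the same assumed access_mask_t), the hypothetical routine below models the store-side check recited in claim 1: when a subsequent store from any core touches a word whose mask bit was set by a prior load, a misprediction is raised and the corresponding unified load-queue entry is signaled using the recorded tracker (entry index) and thread ID. The queue size and the flush_load recovery hook are assumptions for illustration only.

```c
/* Unified load queue shared by all cores/threads (size assumed). */
#define LQ_ENTRIES 64

typedef struct {
    _Bool   valid;
    uint8_t thread_id;
    /* ... address, data, and ordering state would live here ... */
} lq_entry_t;

extern lq_entry_t unified_load_queue[LQ_ENTRIES];

/* Hypothetical recovery hook: replay the mispredicted load. */
extern void flush_load(uint8_t lq_entry, uint8_t thread_id);

/* Store-side check: a subsequent store to a marked word causes a
 * misprediction and signals the load that set the mark, regardless of
 * which thread issued either instruction. Returns 1 on a conflict. */
static int check_store(const access_mask_t *am, unsigned word)
{
    if (am->mask & (1u << word)) {
        uint8_t entry  = am->lq_entry[word];    /* tracker register  */
        uint8_t thread = am->thread_id[word];   /* thread ID register */
        if (unified_load_queue[entry].valid &&
            unified_load_queue[entry].thread_id == thread)
            flush_load(entry, thread);          /* misprediction */
        return 1;   /* store hit a prior load mark */
    }
    return 0;       /* no conflict; store proceeds normally */
}
```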
Address Santa Clara, CA, US