摘要 |
A pipelined x86 processor includes a prefetch unit (prefetch buffer) and a branch unit that cooperate to detect when the target of a branch (designated a short branch) is already in the prefetch buffer, thereby avoiding issuing a prefetch request to retrieve the target. The branch unit includes a branch target cache (BTC) in which each entry stores, in addition to target address information for prefetching a prefetch block of instruction bytes containing a target instruction, a prefetch block location field-when this field is valid, it provides the location of the target instruction for a short branch within a prefetch block that is already in the prefetch buffer. In response to a branch that hits in the BTC, if the associated prefetch block location field is valid, the prefetch unit is able to begin transferring instruction bytes for the target instruction without issuing a prefetch request for the prefetch block containing the target instruction. The exemplary prefetch unit uses a three-block prefetch buffer each storing a 16 byte (cache line) prefetch block-the three prefetch buffers are logically allocated for the current, next, and previous prefetch blocks, and the target of a short branch may be either forward or backward of the branch, and may reside in the same prefetch buffer as the branch (which logically will be current) or in a contiguous prefetch buffer (logically next or previous). Avoiding prefetch requests in the case of short branches reduces contention for cache access and associated bus traffic. |