摘要 |
Dynamic warp subdivision (DWS), which allows a single warp to occupy more than one slot in the scheduler without requiring extra register file space, is described. Independent scheduling entities also allow divergent branch paths to interleave their execution, and allow threads that hit in the cache or otherwise have divergent memory-access latency to run ahead. The result is improved latency hiding and memory level parallelism (MLP).
|