The speed gap between processors and DRAM remains a critical performance bottleneck for contemporary computer systems, which necessitates effective management of last-level caches (LLCs) to minimize expensive off-chip accesses. However, because all sets in a conventional set-associative cache are statically assigned an equal number of blocks, LLC capacity utilization can diminish drastically when the sets exhibit non-uniform capacity demands. To reveal how widespread set-level non-uniformity of capacity demand is in real applications, this technical report first establishes an accurate metric for measuring individual sets’ capacity demands by developing a group of mathematical models. The report then presents a last-level cache design called the COSET (COoperative SET) L2 cache, which identifies the capacity needs of individual sets based on the new metric, dynamically couples two sets with complementary capacity demands, and enables the set with the higher capacity demand to use the capacity of its coupled set to reduce conflict misses. Our simulation study on six selected SPEC CPU 2000 benchmarks shows that the COSET L2 cache achieves an MPKI, normalized to the standard LRU cache, as low as 0.383 and of 0.781 on average, outperforming the state-of-the-art approach SBC, whose best and average results are 0.585 and 0.867, respectively.
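The set-level non-uniformity described above can be illustrated with a minimal sketch of a conventional set-associative LRU cache that counts misses per set. This is not the COSET design, and the per-set miss count is not the report's capacity-demand metric. The class `SetAssociativeLRU`, the helper `block_addr`, the cache geometry (4 sets, 2 ways, 64-byte blocks), and the access pattern are all hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class SetAssociativeLRU:
    """Toy conventional cache: every set statically gets the same number of ways."""

    def __init__(self, num_sets, ways, block_size=64):
        self.num_sets = num_sets
        self.ways = ways
        self.block_size = block_size
        # One OrderedDict per set; insertion order tracks recency (oldest first).
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.misses = [0] * num_sets

    def access(self, addr):
        """Return True on a hit, False on a miss (filling the block on a miss)."""
        block = addr // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)      # mark as most recently used
            return True
        self.misses[index] += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)   # evict the least recently used block
        lines[tag] = True
        return False


def block_addr(set_index, tag, num_sets=4, block_size=64):
    """Build an address that maps to the given set with the given tag."""
    return (tag * num_sets + set_index) * block_size


def run_demo():
    cache = SetAssociativeLRU(num_sets=4, ways=2)
    for _ in range(10):
        # Set 0 cycles over 3 blocks but has only 2 ways, so it thrashes.
        for tag in range(3):
            cache.access(block_addr(0, tag))
        # Set 1 reuses a single block; sets 2 and 3 are never touched.
        cache.access(block_addr(1, 0))
    return cache.misses


if __name__ == "__main__":
    print(run_demo())
```

In this toy pattern, set 0 misses on every access while the ways of sets 2 and 3 sit idle. That is the kind of imbalance that letting a high-demand set borrow capacity from a coupled low-demand set is meant to address.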