Department of Electrical & Computer Engineering


Document Type

Article

Date of this Version

2014

Citation

IEEE TRANSACTIONS ON COMPUTERS, VOL. 63, NO. 7, JULY 2014

Comments

© 2012 IEEE.

Abstract

Most chip multiprocessors today adopt a large shared last-level cache (SLLC). This paper is motivated by our analysis and evaluation of state-of-the-art cache management proposals, which reveal a common weakness: existing alternative replacement policies and cache partitioning schemes, targeted at optimizing either the locality or the utility of co-scheduled threads, cannot consistently deliver the best performance across a variety of workloads. We therefore propose a novel adaptive scheme, called CLU, that interactively co-optimizes the locality and utility of co-scheduled threads in thread-aware SLLC capacity management. CLU employs lightweight monitors to dynamically profile the LRU (least recently used) and BIP (bimodal insertion policy) hit curves of individual threads at runtime, enabling the scheme to co-optimize the locality and utility of concurrent threads and thus adapt to a more diverse set of workloads than existing approaches. We provide results from extensive execution-driven simulation experiments to demonstrate the feasibility and efficacy of CLU over existing approaches (TADIP, NUCACHE, TA-DRRIP, UCP, and PIPP).
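To illustrate the kind of profiling the abstract refers to, the sketch below shows one conventional way to derive a per-thread LRU hit curve with a stack-distance monitor (the same idea underlying utility monitors in utility-based partitioning). The class name, parameters, and Python realization are illustrative assumptions for exposition only; they are not CLU's actual hardware monitors, which additionally profile BIP hit curves.

```python
from collections import OrderedDict

class LRUHitCurveMonitor:
    """Illustrative LRU stack-distance monitor for one thread (hypothetical sketch).

    A hit at stack distance d would also be a hit in any LRU-managed set with
    more than d ways allocated to this thread, so accumulating hits[0..w-1]
    yields the estimated hit count if the thread were given w ways.
    """

    def __init__(self, assoc=16):
        self.assoc = assoc
        self.hits = [0] * assoc        # hits[d] = hits observed at stack distance d
        self.accesses = 0
        self.stack = OrderedDict()     # tag -> None, most recently used kept last

    def access(self, tag):
        self.accesses += 1
        if tag in self.stack:
            # Stack distance = number of more-recently-used tags above this one.
            keys = list(self.stack.keys())
            d = len(keys) - 1 - keys.index(tag)
            self.hits[d] += 1
            self.stack.move_to_end(tag)
        else:
            self.stack[tag] = None
            if len(self.stack) > self.assoc:
                self.stack.popitem(last=False)   # evict the true LRU entry

    def hit_curve(self):
        # hit_curve[w-1] = estimated hits if this thread received w ways
        curve, total = [], 0
        for d in range(self.assoc):
            total += self.hits[d]
            curve.append(total)
        return curve
```

In a partitioning scheme of this style, one such monitor would be kept per thread (typically over a small sample of cache sets to stay lightweight), and the allocation logic would compare the marginal gains of the threads' hit curves when dividing SLLC ways among them.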
