Spatiotemporal capacity management for the last level caches of chip multiprocessors
Judicious management of on-chip last-level caches (LLCs) is critical to alleviating the memory wall of chip multiprocessors (CMPs). Although many LLC management proposals already exist, each operating in either the spatial or the temporal dimension, they fail to capture and exploit the inherent interplay between the two dimensions in capacity management. This dissertation therefore explores and exploits the spatiotemporal interactions in LLC capacity management to improve CMP performance. Based on this general idea, we address four specific research problems. For the private LLC organization, prior proposals improve the efficacy of inter-core cooperative caching at the coarse-grained application level. However, they remain suboptimal because they cannot take advantage of the diverse capacity demands at the fine-grained set level. We introduce the SNUG LLC design, which exploits the set-level non-uniformity of capacity demands and thus further improves performance. Still within private LLC management, we observe that neither spatial nor temporal LLC management schemes, working separately as in prior work, can deliver robust performance under various circumstances, owing to set-level non-uniform capacity demands. We propose a novel adaptive scheme, called STEM, that solves the problem by interactively managing both the spatial and temporal dimensions of capacity demands at the set level. For the shared LLC organization, existing proposals try to improve either locality or utility for heterogeneous workloads. However, we find that none of them consistently delivers the best performance across a variety of workloads, because applications differ widely in their locality and utility features. To address the problem, we present the CLU LLC design, which co-optimizes the locality and utility of co-scheduled threads and thus adapts to more diverse workloads than prior art.
Finally, to make a cache management strategy practical for industry, we need to cut the storage overhead of the re-reference prediction value (RRPV). We observe that carefully tuned replacement policies built on single-bit RRPVs can closely approximate the performance of their counterparts that use log2(associativity)-bit RRPVs. Therefore, we propose a novel practical shared LLC design, called COOP, which requires only a 1-bit RRPV per cache line and a lightweight monitor per core for locality and utility co-optimization. At a considerably lower storage cost, COOP achieves higher performance than two recent practical replacement policies that rely on 2-bit RRPVs but are oriented towards locality optimization only.
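To illustrate what a single-bit RRPV per cache line means in practice, the following is a minimal sketch of a generic not-recently-used (NRU) style replacement policy for one cache set. It is not the COOP design itself, whose tuning and per-core monitor are not described in this abstract; the class name `OneBitRRPVSet`, the choice to insert new lines with bit 0, and the leftmost-victim search are all assumptions made for illustration.

```python
class OneBitRRPVSet:
    """One cache set with a single re-reference bit per line (NRU-style).

    Hypothetical illustration only; not the dissertation's COOP policy.
    """

    def __init__(self, ways):
        self.tags = [None] * ways   # None marks an empty way
        self.rrpv = [1] * ways      # 1 = predicted distant re-reference

    def access(self, tag):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if tag in self.tags:
            # Hit: predict a near re-reference for this line.
            self.rrpv[self.tags.index(tag)] = 0
            return True
        # Miss: if no line is marked distant, age every line at once.
        if 1 not in self.rrpv:
            self.rrpv = [1] * len(self.rrpv)
        # Evict the leftmost line marked distant and fill it (assumed
        # insertion with bit 0, i.e. near re-reference).
        victim = self.rrpv.index(1)
        self.tags[victim] = tag
        self.rrpv[victim] = 0
        return False


if __name__ == "__main__":
    s = OneBitRRPVSet(ways=2)
    for t in ["A", "B", "A", "C"]:
        print(t, "hit" if s.access(t) else "miss")
```

The metadata cost of such a scheme is one bit per line regardless of associativity, whereas a full recency ordering in a 16-way set needs log2(16) = 4 bits per line.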
Zhan, Dongyuan, "Spatiotemporal capacity management for the last level caches of chip multiprocessors" (2012). ETD collection for University of Nebraska - Lincoln. AAI3546215.