Computer Science and Engineering, Department of


Date of this Version



Published in IEEE TRANSACTIONS ON COMPUTERS, VOL. 59, NO. 1, JANUARY 2010; Digital Object Identifier no. 10.1109/TC.2009.115. Copyright 2010 IEEE; published by the IEEE Computer Society


Although data prefetching algorithms have been studied extensively for years, no comparable research has addressed metadata access performance. Existing data prefetching algorithms either lack emphasis on group prefetching or carry a high level of computational complexity, so they do not work well for metadata prefetching. An efficient, accurate, and distributed metadata-oriented prefetching scheme is therefore critical to improving overall performance in large distributed storage systems. In this paper, we present a novel weighted-graph-based prefetching technique, built on both direct and indirect successor relationships, to reap the performance benefit of prefetching specifically for clustered metadata servers, an arrangement envisioned as necessary for petabyte-scale distributed storage systems. Extensive trace-driven simulations show that our new metadata prefetching algorithm effectively reduces the miss rate for metadata accesses on the client site and cuts the average response time of metadata operations by up to 67 percent, compared with the legacy LRU caching algorithm and existing state-of-the-art prefetching algorithms.
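
To make the idea of a weighted graph over direct and indirect successors concrete, the following is a minimal Python sketch. It is not the algorithm from the paper: the class name `SuccessorGraph`, the look-ahead `window`, the `1/distance` weighting, and the top-`k` selection are all assumptions chosen only for illustration. The sketch gives a full weight to the item accessed immediately after another (a direct successor), a smaller weight to items accessed a few steps later (indirect successors), and proposes the heaviest successors as a prefetch group.

```python
from collections import defaultdict


class SuccessorGraph:
    """Weighted directed graph of observed access order (illustrative only)."""

    def __init__(self, window=3):
        # window: how many following accesses count as successors.
        # This parameter is an assumption, not taken from the paper.
        self.window = window
        self.weights = defaultdict(lambda: defaultdict(float))

    def train(self, trace):
        """Accumulate edge weights from a sequence of metadata accesses."""
        for i, current in enumerate(trace):
            following = trace[i + 1 : i + 1 + self.window]
            for distance, successor in enumerate(following, start=1):
                if successor == current:
                    continue  # ignore self-loops
                # Direct successors (distance 1) add 1.0; indirect
                # successors add less (assumed 1/distance decay).
                self.weights[current][successor] += 1.0 / distance

    def prefetch_candidates(self, item, k=2):
        """Return up to k successors of item, heaviest edge first."""
        successors = self.weights.get(item, {})
        ranked = sorted(successors.items(), key=lambda pair: (-pair[1], pair[0]))
        return [name for name, _ in ranked[:k]]


# Hypothetical access trace over four metadata objects.
trace = ["a", "b", "c", "a", "b", "d", "a", "b", "c"]
graph = SuccessorGraph(window=2)
graph.train(trace)
candidates = graph.prefetch_candidates("a", k=2)
```

In this toy trace, `b` always follows `a` directly and `c` follows it indirectly more often than `d` does, so a request for `a` would trigger prefetching of `b` and `c` together. Returning a ranked group rather than a single successor is what distinguishes group prefetching from one-step prediction.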