"DEBAR: A Scalable High-Performance De-duplication Storage System for B" by Tianming Yang, Hong Jiang et al.

Computer Science and Engineering, Department of

 

Date of this Version

1-5-2009

Comments

University of Nebraska–Lincoln, Computer Science and Engineering
Technical Report TR-UNL-CSE-2009-0004
Issued Jan. 5, 2009

Abstract

We present DEBAR, a scalable, high-performance de-duplication storage system for backup and archiving that overcomes the throughput and scalability limitations of state-of-the-art data de-duplication schemes, including the Data Domain De-duplication File System (DDFS). DEBAR uses a two-phase de-duplication scheme (TPDS) that exploits memory cache and disk index properties to turn the notoriously small, random disk I/Os of fingerprint lookups and updates into large sequential disk I/Os, thereby achieving very high de-duplication throughput. The salient feature of this approach is that both the backup and archiving capacity of the system and its de-duplication performance can be scaled up dynamically and cost-effectively on demand. It therefore not only significantly improves the throughput of a single de-duplication server but is also conducive to distributed implementation, making it applicable to large-scale and distributed storage systems.
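
The general idea of replacing per-chunk random index probes with one batched sequential pass can be illustrated with a minimal sketch. This is not DEBAR's actual TPDS algorithm, which is specified in the report itself. The function names (`fingerprint`, `batch_lookup`, `deduplicate`), the use of SHA-1, and the in-memory sorted list standing in for the on-disk index are all assumptions made for illustration only.

```python
from __future__ import annotations

import hashlib
import heapq
from typing import Iterable


def fingerprint(chunk: bytes) -> str:
    """Content fingerprint of a chunk (SHA-1 is an illustrative choice)."""
    return hashlib.sha1(chunk).hexdigest()


def batch_lookup(pending: Iterable[str], sorted_index: list[str]) -> set[str]:
    """Return the pending fingerprints that already appear in sorted_index.

    Both inputs are walked once in sorted order, so a disk-resident index
    would be read front to back rather than probed at random positions.
    """
    found: set[str] = set()
    i = 0
    for fp in sorted(set(pending)):
        # Advance the index cursor; it never moves backwards.
        while i < len(sorted_index) and sorted_index[i] < fp:
            i += 1
        if i < len(sorted_index) and sorted_index[i] == fp:
            found.add(fp)
    return found


def deduplicate(chunks: list[bytes], sorted_index: list[str]) -> tuple[list[str], list[str]]:
    """Hypothetical two-step flow, returning (new_fingerprints, updated_index).

    Step 1: fingerprint the chunks and drop in-batch repeats in memory.
    Step 2: resolve the survivors against the index in one sequential pass,
            then merge the new fingerprints in with a single sorted merge.
    """
    seen: dict[str, bytes] = {}
    for chunk in chunks:
        seen.setdefault(fingerprint(chunk), chunk)

    duplicates = batch_lookup(seen.keys(), sorted_index)
    new = sorted(fp for fp in seen if fp not in duplicates)

    # Both inputs are sorted and disjoint, so the merge is one linear pass.
    updated = list(heapq.merge(sorted_index, new))
    return new, updated
```

In this sketch, deferring lookups until a batch has accumulated is what allows the index to be scanned once in order; the trade-off is that duplicates are only identified after the batch completes rather than inline.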
