Computer Science and Engineering, Department of

 

Date of this Version

Spring 1-31-2014

Comments

Copyright (c) 2014 Lei Xu, Ziling Huang, Hong Jiang, Lei Tian, and David Swanson.

Abstract

The enormous volume of big data creates a need for effective data-filtering techniques that accelerate the analytics process. We propose a Versatile Searchable File System, VSFS, which provides a transparent, flexible, and near-real-time file-level data-filtering service by searching files directly through the file system. Big data analytics applications can therefore use this filtering service transparently, without modification. A versatile index scheme is designed to adapt to the exploratory and ad hoc nature of big data analytics activities. Moreover, VSFS uses a RAM-based distributed architecture to perform file indexing. Evaluations driven by three real-world analytics applications demonstrate VSFS's high scalability and data-filtering capability.
