Computer Science and Engineering, Department of
Date of this Version
Spring 1-31-2014
Abstract
The enormous volume of big data imposes the need for effective data-filtering techniques to accelerate the analytics process. We propose a Versatile Searchable File System, VSFS, which provides a transparent, flexible, and near real-time file-level data-filtering service by searching files directly through the file system. Big data analytics applications can therefore use this filtering service transparently, without application modifications. A versatile index scheme is designed to adapt to the exploratory and ad hoc nature of big data analytics activities. Moreover, VSFS uses a RAM-based distributed architecture to perform file indexing. Evaluations driven by three real-world analytics applications demonstrate VSFS’ high scalability and data-filtering capability.
Included in
Computer and Systems Architecture Commons, Data Storage Systems Commons, OS and Networks Commons, Systems Architecture Commons
Comments
Copyright (c) 2014 Lei Xu, Ziling Huang, Hong Jiang, Lei Tian, and David Swanson.