School of Computing: Technical Reports
Date of this Version
Spring 4-20-2013
Document Type
Article
Citation
Department of Computer Science & Engineering, University of Nebraska-Lincoln, Technical Report, 2013.
Abstract
Big-data and HPC analytics applications urgently need file-search services that drastically reduce the volume of input data and thereby accelerate analytics. Unfortunately, existing solutions either scale poorly on large systems or lack a well-integrated interface that applications can use easily. We propose VSFS, a distributed searchable file system that provides a novel and flexible POSIX-compatible searchable namespace and can be seamlessly integrated with legacy code without modification. Additionally, to provide real-time indexing and searching performance, VSFS uses a DRAM-based distributed consistent hashing ring to manage all file indices. Our evaluation shows that VSFS is scalable in HPC environments. It achieves significantly better file-indexing and file-searching performance than popular SQL/NoSQL solutions while introducing only negligible overhead to raw I/O performance. Finally, we integrate VSFS into a scientific analytics application to demonstrate its benefits in terms of performance and convenience.
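The abstract does not describe how the consistent hashing ring is laid out, so the following is only a minimal sketch of the general technique, not the VSFS implementation. It assumes that each file index is identified by a string key and is assigned to the first in-memory index server found clockwise from the key's hash position. The class name `HashRing`, the server names, the key format, and the use of 64 virtual nodes per server are all hypothetical choices made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent hashing ring that maps index keys to server names.

    Illustrative only: names and parameters are assumptions, not VSFS APIs.
    """

    def __init__(self, servers, vnodes=64):
        # Sorted list of (ring position, server name) pairs.
        self._ring = []
        for server in servers:
            self.add(server, vnodes)

    @staticmethod
    def _hash(key):
        # Take the first 8 bytes of SHA-256 as a 64-bit ring position.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, server, vnodes=64):
        # Each server occupies several positions to even out the load.
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def remove(self, server):
        # Drop every virtual node that belongs to the server.
        self._ring = [(h, s) for h, s in self._ring if s != server]

    def lookup(self, key):
        # Return the first server at or after the key's position,
        # wrapping around to the start of the ring if necessary.
        if not self._ring:
            raise LookupError("ring is empty")
        pos = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[pos % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["index-server-1", "index-server-2", "index-server-3"])
    print(ring.lookup("/data/run42/energy.idx"))
```

The property that makes this structure attractive for a distributed index is that adding or removing a server relocates only the keys adjacent to that server's ring positions, while all other keys keep their current owner.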