"Propeller: A Scalable Real-Time File-Search Service in Distributed Sy" by Lei Xu, Hong Jiang et al.

Computer Science and Engineering, Department of

 

Date of this Version

2014

Citation

2014 IEEE 34th International Conference on Distributed Computing Systems, Pages: 378 - 388, DOI: 10.1109/ICDCS.2014.46

Comments

Copyright © 2014 IEEE. Used by permission.

Abstract

File-search service is a valuable facility to accelerate many analytics applications, because it can drastically reduce the scale of the input data. The main challenge facing the design of large-scale and accurate file-search services is how to support real-time indexing in an efficient and scalable way. To address this challenge, we propose a distributed file-search service, called Propeller, which utilizes a special file-access pattern, called access-causality, to partition file-indices in order to expose substantial access locality and parallelism to accelerate the file-indexing process. The extensive evaluations of Propeller show that it is real-time in file-indexing operations, accurate in file-search results, and scalable in large datasets. It achieves significantly better file-indexing and file-search performance (up to 250×) than a centralized solution (MySQL) and much higher accuracy and substantially lower query latency (up to 22×) than a state-of-the-art desktop search engine (Spotlight).
