Cloud-based data stores employ data replication combined with algorithms that allow consistent, concurrent access to data to offer low response times to their global client-bases. Erasure coding (EC) is a generalization of replication known to be more storage efficient for the same fault tolerance in theory. However, reaping the benefits of EC for geo-distributed data stores in practice poses several new scientific challenges. The overarching goal of this project is to address these challenges by developing a comprehensive understanding of the response time vs. cost trade-off for EC-based consistent geo-distributed stores. The research will be conducted in the following four thrusts: (i) development of EC techniques and distributed algorithms to satisfy consistency criteria known as eventual and causal consistency, (ii) development of efficient, EC-compatible reconfiguration algorithms that provably maintain consistency despite changes in object configurations, (ii) development of analytical models of performance and cost, and associated optimization frameworks, and (iv) integration of developed techniques into Apache Cassandra. The proposed research will be conducted in the context of the public cloud, where inter-DC latencies and pricing information are readily available allowing competing schemes to be compared in a fair manner.The proposed research could lead to cost reductions for geo-distributed data stores hosted on public clouds, which are fundamental building blocks for several applications including collaborative editing, social media, financial transactions, reservation systems, and multi-player gaming. The validation plan involves experiments performed using an Apache Cassandra-based prototype that will be made open source. The prototype will serve as a testbed for other researchers and engineers to plug in their own algorithms and compare findings. The proposed research has a cross-disciplinary nature and will be supplemented with an education plan that involves development of survey articles and curricular integration.This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|Effective start/end date
|8/1/22 → 7/31/25
- National Science Foundation: $593,845.00
Explore the research topics touched on by this project. These labels are generated based on the underlying awards/grants. Together they form a unique fingerprint.