TY - GEN
T1 - Trimming the Tail for Deterministic Read Performance in SSDs
AU - Elyasi, Nima
AU - Choi, Changho
AU - Sivasubramaniam, Anand
AU - Yang, Jingpei
AU - Balakrishnan, Vijay
N1 - Funding Information:
This research has been funded in part by NSF grants 1526750, 1629129, 1714389, 1763681, 1909004 and a DARPA/SRC JUMP award.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
AB - With SSDs becoming commonplace in several customer-facing datacenter applications, there is a critical need for optimizing for tail latencies (particularly reads). In this paper, we conduct a systematic analysis, removing one bottleneck after another, to study the root causes behind long tail latencies on a state-of-the-art high-end SSD. Contrary to a lot of prior observations, we find that Garbage Collection (GC) is not a key contributor, and it is more the variances in queue lengths across the flash chips that is the culprit. Particularly, reads waiting for long latency writes, which has been the target for much study, is at the root of this problem. While write pausing/preemption has been proposed as a remedy, in this paper we explore a more simple and alternate solution that leverages existing RAID groups into which flash chips are organized. While a long latency operation is ongoing, rather than waiting, the read could get its data by reconstructing it from the remaining chips of that group (including parity). However, this introduces additional reads, and we propose an adaptive scheduler called ATLAS that dynamically figures out whether to wait or to reconstruct the data from other chips. The resulting ATLAS optimization cuts the 99.99th percentile read latency by as much as 10X, with a reduction of 4X on the average across a wide spectrum of workloads.
UR - http://www.scopus.com/inward/record.url?scp=85083103702&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85083103702&partnerID=8YFLogxK
U2 - 10.1109/IISWC47752.2019.9042073
DO - 10.1109/IISWC47752.2019.9042073
M3 - Conference contribution
AN - SCOPUS:85083103702
T3 - Proceedings of the 2019 IEEE International Symposium on Workload Characterization, IISWC 2019
SP - 49
EP - 58
BT - Proceedings of the 2019 IEEE International Symposium on Workload Characterization, IISWC 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th IEEE International Symposium on Workload Characterization, IISWC 2019
Y2 - 3 November 2019 through 5 November 2019
ER -