SSD Failures in datacenters: What, when and why?

Iyswarya Narayanan, Di Wang, Myeongjae Jeon, Bikash Sharma, Laura Caulfield, Anand Sivasubramaniam, Ben Cutler, Jie Liu, Badriddine Khessib, Kushagra Vaid

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

Despite the growing popularity of Solid State Disks (SSDs) in the datacenter, little is known about their reliability characteristics in the field. What little is known is mainly vendor supplied, which does not help explain how SSD failures manifest and impact production systems, or what actions to take in response. Besides failure data, a detailed characterization requires a wide spectrum of data about factors influencing SSD failures, from provisioning (which models, where and when deployed, etc.) to operational factors (workloads, read-write intensities, write amplification, etc.). We analyze over half a million SSDs spanning multiple generations, spread across several datacenters that host a wide range of workloads, over nearly 3 years. By studying the diverse set of factors affecting SSD failures, along with their symptoms, our work provides the first look at the what, when and why characteristics of SSD failures in production datacenters.

Original language: English (US)
Title of host publication: SIGMETRICS/ Performance 2016 - Proceedings of the SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Science
Publisher: Association for Computing Machinery, Inc
Pages: 407-408
Number of pages: 2
ISBN (Electronic): 9781450342667
State: Published - Jun 14 2016
Event: 13th Joint International Conference on Measurement and Modeling of Computer Systems, ACM SIGMETRICS / IFIP Performance 2016 - Antibes Juan-les-Pins, France
Duration: Jun 14 2016 to Jun 18 2016

Publication series

Name: SIGMETRICS/ Performance 2016 - Proceedings of the SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Science

Other

Other: 13th Joint International Conference on Measurement and Modeling of Computer Systems, ACM SIGMETRICS / IFIP Performance 2016
Country/Territory: France
City: Antibes Juan-les-Pins
Period: 6/14/16 to 6/18/16

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computational Theory and Mathematics
  • Hardware and Architecture
