REMAP: A reliability/endurance mechanism for advancing PCM

Mohammad Khavari Tavana, Amir Kavyan Ziabari, Mohammad Arjomand, Mahmut Kandemir, Chita Das, David Kaeli

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Scopus citations

Abstract

Despite PCM's attractive features, which include high scalability and low power, write endurance remains a critical issue that impedes this technology from replacing DRAM in main memory systems. The wear-out problem is further exacerbated in future technology nodes, where cell sizes shrink and process variation increases. In PCM, worn-out cells are permanently stuck at either '0' or '1'. Successful adoption of PCM requires recovery from multiple stuck-at faults in a data block. Current error correction schemes for PCM have limited capability to tolerate faults. In this paper we propose REMAP to improve the reliability of PCM so that it can tolerate a large number of hard faults. In contrast to previous schemes, REMAP uses all of the metadata space for replacing faulty bits; no error detection or location information needs to be stored. Failed memory cells are detected and located by read verification and an extra write operation. Although it tolerates many hard errors, REMAP can negatively impact cell lifetime because of the extra writes. We propose solutions that alleviate this problem and increase memory lifetime significantly. REMAP performs write endurance localization using both static and dynamic partitioning. Additionally, fault location caching is used to avoid the extra write overhead. Given the error correction capabilities of REMAP, we consider using it as a second layer of defense, combined with other schemes. Our evaluation, which includes both Monte Carlo and trace-driven simulation, shows that REMAP boosts PCM lifetime by 56% on average (up to 78%) compared to our baseline.
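The abstract states that failed cells are found by read verification plus an extra write, but it does not spell out the procedure. The Python sketch below shows one plausible way such a check could work on a toy model, and is not taken from the paper: it writes the data, reads it back, then writes the bitwise complement and reads again, so that every stuck cell disagrees with at least one of the two patterns. The names `PcmBlock` and `locate_stuck_bits`, and the use of the complement as the second pattern, are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PcmBlock:
    """Toy model of a PCM data block (hypothetical, for illustration only)."""
    width: int
    stuck: dict = field(default_factory=dict)  # bit position -> stuck value (0 or 1)
    cells: int = 0
    writes: int = 0  # counts write operations, i.e. wear on the block

    def write(self, value: int) -> None:
        self.writes += 1
        value &= (1 << self.width) - 1
        for pos, bit in self.stuck.items():
            # A stuck cell ignores the written bit and keeps its stuck value.
            value = (value & ~(1 << pos)) | (bit << pos)
        self.cells = value

    def read(self) -> int:
        return self.cells


def locate_stuck_bits(block: PcmBlock, data: int) -> set:
    """Find stuck cells by read-verifying the data and then its complement."""
    mask = (1 << block.width) - 1
    data &= mask
    inverted = ~data & mask

    block.write(data)
    mismatches = block.read() ^ data        # cells stuck at the opposite of the data bit

    block.write(inverted)                   # the extra write (assumed: complement pattern)
    mismatches |= block.read() ^ inverted   # cells stuck at the same value as the data bit

    return {pos for pos in range(block.width) if (mismatches >> pos) & 1}
```

In this model the second write is pure overhead in terms of wear, which is consistent with the abstract's remark that extra writes can shorten cell lifetime and that caching fault locations avoids them. Note that the sketch leaves the complement pattern in the block; a real scheme would still have to store the intended data.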

Original language: English (US)
Title of host publication: MEMSYS 2017 - Proceedings of the International Symposium on Memory Systems
Publisher: Association for Computing Machinery
Pages: 385-398
Number of pages: 14
ISBN (Electronic): 9781450353359
State: Published - Oct 2 2017
Event: 2017 International Symposium on Memory Systems, MEMSYS 2017 - Washington, United States
Duration: Oct 2 2017 to Oct 5 2017

Publication series

Name: ACM International Conference Proceeding Series
Volume: Part F131197

Other

Other: 2017 International Symposium on Memory Systems, MEMSYS 2017
Country/Territory: United States
City: Washington
Period: 10/2/17 to 10/5/17

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
