Parallelism-Based Techniques for Slowing Down Soft Error Propagation

Zuhal Ozturk, Haluk Rahmi Topcuoglu, Mahmut Taylan Kandemir

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Vulnerability to soft errors has motivated various fault tolerance techniques for modern computing systems, which can be implemented at the hardware and software layers. While these techniques improve reliability, they introduce additional costs that may not be tolerable for some systems, and several studies in the literature aim to reduce such costs. In this study, we monitor soft error propagation throughout execution and propose simple and relatively inexpensive methods to slow down the error propagation curves. Matrix multiplication is the target multi-threaded application, for which we use parallelization-based versions that vary the number of threads and the loop parallelization options. The fault injection experiments reveal that these methods reshape the error propagation curves effectively. They can also be applied at runtime, where switching between different versions during operation helps balance reliability and performance while using limited resources more efficiently.
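
As a rough illustration of what monitoring soft error propagation in matrix multiplication can look like, the following minimal sketch flips one bit in an input element and lists the output cells that differ from a fault-free run. It is not the setup used in the paper: it is single-threaded, and the helper names (`flip_bit`, `matmul`, `corrupted_cells`, `inject_and_compare`) and the choice of bit position are assumptions made for illustration only.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its IEEE-754 double encoding flipped."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return flipped


def matmul(a, b):
    """Plain triple-loop matrix product on lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def corrupted_cells(golden, observed):
    """List the (row, col) positions where the observed output differs."""
    return [
        (i, j)
        for i, row in enumerate(golden)
        for j, expected in enumerate(row)
        if observed[i][j] != expected
    ]


def inject_and_compare(a, b, row, col, bit=52):
    """Flip one bit of a[row][col] (hypothetical fault model) and report
    which output cells differ from the fault-free result."""
    golden = matmul(a, b)
    faulty_a = [r[:] for r in a]  # copy so the original input stays intact
    faulty_a[row][col] = flip_bit(faulty_a[row][col], bit)
    return corrupted_cells(golden, matmul(faulty_a, b))


if __name__ == "__main__":
    n = 4
    a = [[float(i + j + 1) for j in range(n)] for i in range(n)]
    b = [[float(k * n + j + 1) for j in range(n)] for k in range(n)]
    print(inject_and_compare(a, b, row=1, col=2))
```

In this sketch a fault in a single element of the first operand can only reach the output cells of the same row, which is one simple way to see that the computation's structure bounds how far an error spreads.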

Original language: English (US)
Title of host publication: Proceedings of the 2022 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress, DASC/PiCom/CBDCom/CyberSciTech 2022
Editors: Giancarlo Fortino, Raffaele Gravina, Antonio Guerrieri, Claudio Savaglio
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665462976
State: Published - 2022
Event: 20th IEEE International Conference on Dependable, Autonomic and Secure Computing, 20th IEEE International Conference on Pervasive Intelligence and Computing, 7th IEEE International Conference on Cloud and Big Data Computing, 2022 IEEE International Conference on Cyber Science and Technology Congress, DASC/PiCom/CBDCom/CyberSciTech 2022 - Falerna, Italy
Duration: Sep 12 2022 to Sep 15 2022

Publication series

Name: Proceedings of the 2022 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress, DASC/PiCom/CBDCom/CyberSciTech 2022

Conference

Conference: 20th IEEE International Conference on Dependable, Autonomic and Secure Computing, 20th IEEE International Conference on Pervasive Intelligence and Computing, 7th IEEE International Conference on Cloud and Big Data Computing, 2022 IEEE International Conference on Cyber Science and Technology Congress, DASC/PiCom/CBDCom/CyberSciTech 2022
Country/Territory: Italy
City: Falerna
Period: 9/12/22 to 9/15/22

All Science Journal Classification (ASJC) codes

  • Management of Technology and Innovation
  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Control and Optimization
