Mutable checkpoints: A new checkpointing approach for mobile computing systems

Guohong Cao, Mukesh Singhal

Research output: Contribution to journalArticlepeer-review

110 Scopus citations

Abstract

Mobile computing raises many new issues such as lack of stable storage, low bandwidth of wireless channel, high mobility, and limited battery life. These new issues make traditional checkpointing algorithms unsuitable. Coordinated checkpointing is an attractive approach for transparently adding fault tolerance to distributed applications since it avoids domino effects and minimizes the stable storage requirement. However, it suffers from high overhead associated with the checkpointing process in mobile computing systems. Two approaches have been used to reduce the overhead: First is to minimize the number of synchronization messages and the number of checkpoints; the other is to make the checkpointing process nonblocking. These two approaches were orthogonal previously until the Prakash-Singhal algorithm [28] combined them. However, we [8] found that this algorithm may result in an inconsistency in some situations and we proved that there does not exist a nonblocking algorithm which forces only a minimum number of processes to take their checkpoints. In this paper, we introduce the concept of "mutable checkpoint," which is neither a tentative checkpoint nor a permanent checkpoint, to design efficient checkpointing algorithms for mobile computing systems. Mutable checkpoints can be saved anywhere, e.g., the main memory or local disk of MHs. In this way, taking a mutable checkpoint avoids the overhead of transferring large amounts of data to the stable storage at MSSs over the wireless network. We present techniques to minimize the number of mutable checkpoints. Simulation results show that the overhead of taking mutable checkpoints is negligible. Based on mutable checkpoints, our nonblocking algorithm avoids the avalanche effect and forces only a minimum number of processes to take their checkpoints on the stable storage.

Original languageEnglish (US)
Pages (from-to)157-172
Number of pages16
JournalIEEE Transactions on Parallel and Distributed Systems
Volume12
Issue number2
DOIs
StatePublished - Feb 2001

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics

Fingerprint

Dive into the research topics of 'Mutable checkpoints: A new checkpointing approach for mobile computing systems'. Together they form a unique fingerprint.

Cite this