Reliable distributed diagnosis for multiprocessor systems with random faults

Piotr Berman, Andrzej Pelc

    Research output: Contribution to journal › Article › peer-review

    2 Scopus citations

    Abstract

    We study a probabilistic setting for distributed fault diagnosis in multiprocessor systems. A system is an undirected graph with nodes representing processors and edges representing communication links. Processors are assumed to fail independently with some probability p. They test their neighbors, and a fault-free processor has probability 1 − q of discovering a fault of a failed neighbor in an individual test. Subsequently, fault-free processors attempt to diagnose all the processors of the system with communication based on the test results. During communication, the behavior of faulty processors may be arbitrary (so-called malicious). For every p ≤ ½, q ≤ 1, we construct systems with O(n log n) links in which distributed probabilistic diagnosis can be achieved with probability of correctness at least 1 − n^(−1). We also show that for some small fixed p and q a similar result holds for the hypercube. On the other hand, we prove that for sufficiently small k, for a system with n processors and kn log n links, the probability of achieving correct diagnosis cannot exceed n^(−0.5). © 1994 by John Wiley & Sons, Inc.
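
The fault and test model described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' construction or diagnosis procedure: the function name `simulate_tests` is hypothetical, a faulty tester's report is modeled as a coin flip (the abstract only says faulty behavior may be arbitrary), and a fault-free tester is assumed never to accuse a fault-free neighbor (the abstract does not state this). The 3-dimensional hypercube is used purely as an example topology.

```python
import random

def simulate_tests(edges, n, p, q, rng):
    """Sample one fault pattern and one round of neighbor tests.

    Returns (faulty, results), where results[(tester, tested)] is True
    when the tester reports its neighbor as faulty.
    """
    # Each processor fails independently with probability p.
    faulty = [rng.random() < p for _ in range(n)]
    results = {}
    for u, v in edges:
        for tester, tested in ((u, v), (v, u)):
            if faulty[tester]:
                # Assumption: an arbitrary report, modeled here as a coin flip.
                results[(tester, tested)] = rng.random() < 0.5
            elif faulty[tested]:
                # A fault-free tester detects a failed neighbor with probability 1 - q.
                results[(tester, tested)] = rng.random() >= q
            else:
                # Assumption: no false accusations between fault-free processors.
                results[(tester, tested)] = False
    return faulty, results

# Example topology: 3-dimensional hypercube, nodes 0..7,
# with an edge between labels that differ in exactly one bit.
n = 8
edges = [(u, u ^ (1 << b)) for u in range(n) for b in range(3) if u < u ^ (1 << b)]
faulty, results = simulate_tests(edges, n, p=0.1, q=0.2, rng=random.Random(0))
```

Each undirected link yields two directed test results, one per endpoint. The diagnosis step itself, in which fault-free processors exchange these results, is not modeled here.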

    Original language: English (US)
    Pages (from-to): 417-427
    Number of pages: 11
    Journal: Networks
    Volume: 24
    Issue number: 8
    State: Published - Dec 1994

    All Science Journal Classification (ASJC) codes

    • Information Systems
    • Computer Networks and Communications
