Distributed probabilistic fault diagnosis for multiprocessor systems

Piotr Berman, Andrzej Pelc

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    39 Scopus citations

    Abstract

    A class of n-unit multiprocessor systems with O(n log n) interconnecting links is constructed, and a distributed probabilistic fault diagnosis algorithm is proposed whose probability of correctness converges to 1 as n → ∞. For a small probability of unit failure, a distributed diagnosis whose probability of correctness also converges to 1 as the size of the system grows is proposed for the hypercube. On the other hand, it is proved that if a class of systems has fewer than kn log n links for a small constant k, the probability of correctness of every fault diagnosis converges to 0 as n → ∞. By combining the probabilistic and the distributed approaches, the authors' model of fault diagnosis removes the major drawbacks of the PMC (Preparata-Metze-Chien) model: the assumption of tests with complete fault coverage and the assumption of a fault-free central monitoring unit capable of performing diagnosis.

    Original language: English (US)
    Title of host publication: Digest of Papers - FTCS (Fault-Tolerant Computing Symposium)
    Publisher: IEEE
    Pages: 340-346
    Number of pages: 7
    ISBN (Print): 081862051X
    State: Published - 1990
    Event: 20th International Symposium on Fault-Tolerant Computing - FTCS 20 - Chapel Hill, NC, USA
    Duration: Jun 26 1990 to Jun 28 1990

    Publication series

    Name: Digest of Papers - FTCS (Fault-Tolerant Computing Symposium)
    ISSN (Print): 0731-3071

    All Science Journal Classification (ASJC) codes

    • Hardware and Architecture
