On optimal monitor placement for localizing node failures via network tomography

Liang Ma, Ting He, Ananthram Swami, Don Towsley, Kin K. Leung

Research output: Contribution to journalArticlepeer-review

27 Scopus citations

Abstract

Abstract We investigate the problem of placing monitors to localize node failures in a communication network from binary states (normal/failed) of end-to-end paths, under the assumption that a path is in normal state if and only if it contains no failed nodes. To uniquely localize failed nodes, the measurement paths must show different symptoms (path states) under different failure events. Our goal is to deploy the minimum set of monitors to satisfy this condition for a given probing mechanism. We consider three families of probing mechanisms, according to whether measurement paths are (i) arbitrarily controllable, (ii) controllable but cycle-free, or (iii) uncontrollable (i.e., determined by the default routing protocol). We first establish theoretical conditions that characterize network-wide failure identifiability through a per-node identifiability measure that can be efficiently evaluated for the above three probing mechanisms. Leveraging these results, we develop a generic monitor placement algorithm, applicable under any probing mechanism, that incrementally selects monitors to optimize the per-node measure. The proposed algorithm is shown to be optimal for probing mechanism (i), and provides upper and lower bounds on the minimum number of monitors required by the other probing mechanisms. In the special case of single-node failures, we develop an improved monitor placement algorithm that is optimal for probing mechanism (ii) and has linear time complexity. Using these algorithms, we study the impact of the probing mechanism on the number of monitors required for uniquely localizing node failures. Our results based on real network topologies show that although more complicated to implement, probing mechanisms that allow monitors to control measurement paths substantially reduce the required number of monitors.

Original languageEnglish (US)
Article number1818
Pages (from-to)16-37
Number of pages22
JournalPerformance Evaluation
Volume91
DOIs
StatePublished - Sep 1 2015

All Science Journal Classification (ASJC) codes

  • Software
  • Modeling and Simulation
  • Hardware and Architecture
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'On optimal monitor placement for localizing node failures via network tomography'. Together they form a unique fingerprint.

Cite this