TY - GEN
T1 - Node failure localization via network tomography
AU - Ma, Liang
AU - He, Ting
AU - Swami, Ananthram
AU - Towsley, Don
AU - Leung, Kin K.
AU - Lowe, Jessica
N1 - Publisher Copyright:
Copyright © 2014 by the Association for Computing Machinery, Inc. (ACM).
PY - 2014/11/5
Y1 - 2014/11/5
N2 - We investigate the problem of localizing node failures in a communication network from end-to-end path measurements, under the assumption that a path behaves normally if and only if it does not contain any failed nodes. To uniquely localize node failures, the measurement paths must show different symptoms under different failure events, i.e., for any two distinct sets of failed nodes, there must be a measurement path traversing one and only one of them. This condition is, however, impractical to test for large networks. Our first contribution is a characterization of this condition in terms of easily verifiable conditions on the network topology with given monitor placements under three families of probing mechanisms, which differ in whether measurement paths are (i) arbitrarily controllable, (ii) controllable but cycle-free, or (iii) uncontrollable (i.e., determined by the default routing protocol). Our second contribution is a characterization of the maximum identifiability of node failures, measured by the maximum number of simultaneous failures that can always be uniquely localized. Specifically, we bound the maximal identifiability from both the upper and the lower bounds which differ by at most one, and show that these bounds can be evaluated in polynomial time. Finally, we quantify the impact of the probing mechanism on the capability of node failure localization under different probing mechanisms on both random and real network topologies. We observe that despite a higher implementation cost, probing along controllable paths can significantly improve a network's capability to localize simultaneous node failures.
AB - We investigate the problem of localizing node failures in a communication network from end-to-end path measurements, under the assumption that a path behaves normally if and only if it does not contain any failed nodes. To uniquely localize node failures, the measurement paths must show different symptoms under different failure events, i.e., for any two distinct sets of failed nodes, there must be a measurement path traversing one and only one of them. This condition is, however, impractical to test for large networks. Our first contribution is a characterization of this condition in terms of easily verifiable conditions on the network topology with given monitor placements under three families of probing mechanisms, which differ in whether measurement paths are (i) arbitrarily controllable, (ii) controllable but cycle-free, or (iii) uncontrollable (i.e., determined by the default routing protocol). Our second contribution is a characterization of the maximum identifiability of node failures, measured by the maximum number of simultaneous failures that can always be uniquely localized. Specifically, we bound the maximal identifiability from both the upper and the lower bounds which differ by at most one, and show that these bounds can be evaluated in polynomial time. Finally, we quantify the impact of the probing mechanism on the capability of node failure localization under different probing mechanisms on both random and real network topologies. We observe that despite a higher implementation cost, probing along controllable paths can significantly improve a network's capability to localize simultaneous node failures.
UR - http://www.scopus.com/inward/record.url?scp=84910155814&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84910155814&partnerID=8YFLogxK
U2 - 10.1145/2663716.2663723
DO - 10.1145/2663716.2663723
M3 - Conference contribution
AN - SCOPUS:84910155814
T3 - Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC
SP - 195
EP - 208
BT - IMC 2014 - Proceedings of the 2014 ACM
PB - Association for Computing Machinery
T2 - 2014 ACM Internet Measurement Conference, IMC 2014
Y2 - 5 November 2014 through 7 November 2014
ER -