TY - GEN
T1 - Detecting and localizing large-scale router failures using active probes
AU - Zheng, Qiang
AU - Cao, Guohong
AU - La Porta, Tom
AU - Swami, Ananthram
N1 - Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2011
Y1 - 2011
AB - Detecting the occurrence of large-scale router failures and localizing the failed routers are critical to enhancing network reliability. We propose a two-phase approach for detecting and localizing large-scale router failures using traceroute-like active probes. To detect large-scale router failures, the detection phase is periodically invoked to probe all routers. When large-scale router failures are detected, the localization phase is triggered to identify the failed routers. We reduce the probing cost by avoiding three types of useless probes. For the routers whose status cannot be identified by probes, we develop a distance-based method to estimate their failure probability. Experimental results based on ISP topologies show that the accuracy of our approach is higher than 96.5%, even when only 10% of routers are connected by end systems for probing. Compared with prior work, the proposed approach achieves much higher accuracy with lower probing cost.
UR - http://www.scopus.com/inward/record.url?scp=84856976110&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84856976110&partnerID=8YFLogxK
U2 - 10.1109/MILCOM.2011.6127458
DO - 10.1109/MILCOM.2011.6127458
M3 - Conference contribution
AN - SCOPUS:84856976110
SN - 9781467300810
T3 - Proceedings - IEEE Military Communications Conference MILCOM
SP - 1170
EP - 1175
BT - 2011 Military Communications Conference, MILCOM 2011
T2 - 2011 IEEE Military Communications Conference, MILCOM 2011
Y2 - 7 November 2011 through 10 November 2011
ER -