Optimal recovery from large-scale failures in IP networks

Qiang Zheng, Guohong Cao, Tom La Porta, Ananthram Swami

Research output: Contribution to conference › Paper › peer-review

13 Scopus citations


Quickly recovering IP networks from failures is critical to enhancing Internet robustness and availability. Due to their serious impact on network routing, large-scale failures have received increasing attention in recent years. We propose an approach called Reactive Two-phase Rerouting (RTR) for intra-domain routing to quickly recover from large-scale failures with the shortest recovery paths. To recover a failed routing path, RTR first forwards packets around the failure area to collect information on failures. Then, in the second phase, RTR calculates a new shortest path and forwards packets along it through source routing. RTR can deal with large-scale failures associated with areas of any shape and location, and is free of permanent loops. For any failure area, the recovery paths provided by RTR are guaranteed to be the shortest. Extensive simulations based on ISP topologies show that RTR can find the shortest recovery paths for more than 98.6% of failed routing paths with reachable destinations. Compared with prior work, RTR achieves better performance for recoverable failed routing paths and uses far fewer network resources for irrecoverable failed routing paths.
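The second phase described in the abstract (recomputing a shortest path once failure information is known, then source-routing along it) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the first phase has already produced a set of failed links, uses plain Dijkstra as a stand-in for the path computation, and the names `shortest_recovery_path`, `topology`, and `failed` are hypothetical.

```python
import heapq


def shortest_recovery_path(graph, failed_links, src, dst):
    """Dijkstra over the topology with known-failed links skipped.

    graph: dict mapping node -> {neighbor: link cost}
    failed_links: set of frozenset({u, v}) pairs, assumed here to be
                  the information gathered in the first phase
    Returns (cost, [src, ..., dst]), or None if dst is unreachable.
    """
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == dst:
            break
        for v, cost in graph[u].items():
            if frozenset((u, v)) in failed_links:
                continue  # link is inside the failure area
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    # Walk predecessors back to the source; the resulting hop list is
    # what a source-routed packet would carry.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[dst], path


# Hypothetical four-node topology with symmetric link costs.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "D": 1},
    "C": {"A": 4, "D": 1},
    "D": {"B": 1, "C": 1},
}
failed = {frozenset(("B", "D"))}
print(shortest_recovery_path(topology, failed, "A", "D"))
```

With link B-D marked as failed, the sketch routes A to D through C instead of B. When every link into D is failed it returns `None`, which corresponds to the irrecoverable case mentioned in the abstract.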

Original language: English (US)
Number of pages: 10
State: Published - 2012
Event: 32nd IEEE International Conference on Distributed Computing Systems, ICDCS 2012 - Macau, China
Duration: Jun 18 2012 to Jun 21 2012



All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

