CloudPD: Problem determination and diagnosis in shared dynamic clouds

Bikash Sharma, Praveen Jayachandran, Akshat Verma, Chita R. Das

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

59 Scopus citations

Abstract

In this work, we address problem determination in virtualized clouds. We show that high dynamism, resource sharing, frequent reconfiguration, high propensity to faults and automated management introduce significant new challenges towards fault diagnosis in clouds. Towards this, we propose CloudPD, a fault management framework for clouds. CloudPD leverages (i) a canonical representation of the operating environment to quantify the impact of sharing; (ii) an online learning process to tackle dynamism; (iii) correlation-based performance models for higher detection accuracy; and (iv) an integrated end-to-end feedback loop to synergize with a cloud management ecosystem. Using a prototype implementation with cloud-representative batch and transactional workloads like Hadoop, Olio and RUBiS, it is shown that CloudPD detects and diagnoses faults with low false positives (< 16%) and high accuracy of 88%, 83% and 83%, respectively. In an enterprise trace-based case study, CloudPD diagnosed anomalies within 30 seconds and with an accuracy of 77%, demonstrating its effectiveness in real-life operations.

Original language: English (US)
Title of host publication: 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2013
State: Published - 2013
Event: 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2013 - Budapest, Hungary
Duration: Jun 24 2013 to Jun 27 2013

Publication series

Name: Proceedings of the International Conference on Dependable Systems and Networks

Other

Other: 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2013
Country/Territory: Hungary
City: Budapest
Period: 6/24/13 to 6/27/13

All Science Journal Classification (ASJC) codes

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications
