Empirical Investigation of role of Meta-learning approaches for the Improvement of Software Development Process via Software Fault Prediction

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Context: The Software Engineering (SE) community has empirically investigated software defect prediction as a proxy, benchmarking it as a process improvement activity that assures software quality. In software fault prediction, the performance of classification algorithms is strongly affected by the residual effects of feature irrelevance and data redundancy. Problem: Meta-learning-based ensemble methods are commonly used to mitigate these noise effects and boost software fault prediction performance. However, the performance of meta-learning ensemble methods as fault predictors still needs to be benchmarked to support software quality control and aid developers in their decision making. Method: We conduct an empirical, comparative study to evaluate and benchmark the improvement in fault prediction performance that meta-learning ensemble methods achieve over their component base-level fault predictors. We perform a series of experiments with four well-known meta-level ensemble methods: Vote, StackingC (i.e., Stacking), MultiScheme, and Grading. We also use five high-performance base-level fault predictors: Logistic (i.e., Logistic Regression), J48 (i.e., Decision Tree), IBK (i.e., k-nearest neighbor), NaiveBayes, and Decision Table (DT). We ran these experiments on public defect datasets with k-fold (k=10) cross-validation. We used the F-measure and ROC-AUC (Receiver Operating Characteristic-Area Under Curve) as performance measures and applied four non-parametric tests to benchmark the fault prediction results of the meta-learning ensemble methods. Results and Conclusion: We conclude that meta-learning ensemble methods, especially Vote, can outperform the base-level fault predictors in tackling feature irrelevance and redundancy issues in software fault prediction. That said, their performance depends strongly on the number of base-level classifiers and the set of software fault prediction metrics.
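To make the comparison described in the abstract concrete, the following is a minimal sketch of how a Vote-style ensemble can be scored against its base-level predictors with the F-measure. It is not the authors' setup: the function names, the 0.5 threshold, the three predictors, and the toy probabilities are all hypothetical, and the average-of-probabilities combination rule is an assumption, since the abstract does not state which rule Vote used. Cross-validation, ROC-AUC, and the statistical tests are left out.

```python
from statistics import mean


def f_measure(y_true, y_pred):
    """F-measure for the faulty class (label 1): 2TP / (2TP + FP + FN)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def vote_average(prob_lists):
    """Average the base predictors' fault probabilities, module by module.

    Assumption: average-of-probabilities voting; other rules
    (majority vote, max, product) are equally plausible.
    """
    return [mean(column) for column in zip(*prob_lists)]


def to_labels(probs, threshold=0.5):
    """Turn fault probabilities into 0/1 labels (threshold is hypothetical)."""
    return [1 if p >= threshold else 0 for p in probs]


# Hypothetical toy data: 1 = faulty module, 0 = clean module.
y_true = [1, 0, 1, 0, 1, 0]

# Hypothetical fault probabilities from three base-level predictors.
base_probs = {
    "logistic": [0.9, 0.2, 0.4, 0.6, 0.8, 0.1],
    "naive_bayes": [0.8, 0.3, 0.6, 0.4, 0.3, 0.2],
    "knn": [0.7, 0.1, 0.7, 0.3, 0.45, 0.4],
}

vote_probs = vote_average(list(base_probs.values()))
vote_pred = to_labels(vote_probs)

for name, probs in base_probs.items():
    print(name, round(f_measure(y_true, to_labels(probs)), 3))
print("vote", round(f_measure(y_true, vote_pred), 3))
```

The toy numbers only illustrate the mechanics of combining and scoring predictors; they say nothing about the results reported in the paper, where each predictor was evaluated on public defect datasets under 10-fold cross-validation.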

Original language: English (US)
Title of host publication: Proceedings of the ACM International Conference on Evaluation and Assessment in Software Engineering, EASE 2022
Publisher: Association for Computing Machinery
Pages: 413-420
Number of pages: 8
ISBN (Electronic): 9781450396134
DOIs
State: Published - Jun 13 2022
Event: 26th ACM International Conference on Evaluation and Assessment in Software Engineering, EASE 2022 - Gothenburg, Sweden
Duration: Jun 13 2022 to Jun 15 2022

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 26th ACM International Conference on Evaluation and Assessment in Software Engineering, EASE 2022
Country/Territory: Sweden
City: Gothenburg
Period: 6/13/22 to 6/15/22

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
