TY - GEN
T1 - Empirical Investigation of role of Meta-learning approaches for the Improvement of Software Development Process via Software Fault Prediction
AU - Hussain, Shahid
AU - Ibrahim, Naseem
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/6/13
Y1 - 2022/6/13
N2 - Context: Software Engineering (SE) community has empirically investigated software defect prediction as a proxy to benchmark it as a process improvement activity to assure software quality. In the domain of software fault prediction, the performance of classification algorithms is highly provoked with the residual effects attributed to feature irrelevance and data redundancy issues. Problem: The meta-learning-based ensemble methods are usually carried out to mitigate these noise effects and boost the software fault prediction performance. However, there is a need to benchmark the performance of meta-learning ensemble methods (as fault predictor) to assure software quality control and aid developers in their decision making. Method: We conduct an empirical and comparative study to evaluate and benchmark the improvement in the fault prediction performance via meta-learning ensemble methods as compared to their component base-level fault predictors. In this study, we perform a series of experiments with four well-known meta-level ensemble methods Vote, StackingC (i.e., Stacking), MultiScheme, and Grading. We also use five high-performance fault predictors Logistic (i.e., Logistic Regression), J48 (i.e., Decision Tree), IBK (i.e. k-nearest neighbor), NaiveBayes, and Decision Table (DT). Subsequently, we performed these experiments on public defect datasets with k-fold (k=10) cross-validation. We used F-measure and ROC-AUC (Receiver Operating Characteristic-Area Under Curve) performance measures and applied the four non-parametric tests to benchmark the fault prediction performance results of meta-learning ensemble methods. Results and Conclusion: we conclude that meta-learning ensemble methods, especially Vote could outperform the base-level fault predictors to tackle the feature irrelevance and redundancy issues in the domain of software fault prediction. Having said that, their performance is highly related to the number of base-level classifiers and the set of software fault prediction metrics.
AB - Context: Software Engineering (SE) community has empirically investigated software defect prediction as a proxy to benchmark it as a process improvement activity to assure software quality. In the domain of software fault prediction, the performance of classification algorithms is highly provoked with the residual effects attributed to feature irrelevance and data redundancy issues. Problem: The meta-learning-based ensemble methods are usually carried out to mitigate these noise effects and boost the software fault prediction performance. However, there is a need to benchmark the performance of meta-learning ensemble methods (as fault predictor) to assure software quality control and aid developers in their decision making. Method: We conduct an empirical and comparative study to evaluate and benchmark the improvement in the fault prediction performance via meta-learning ensemble methods as compared to their component base-level fault predictors. In this study, we perform a series of experiments with four well-known meta-level ensemble methods Vote, StackingC (i.e., Stacking), MultiScheme, and Grading. We also use five high-performance fault predictors Logistic (i.e., Logistic Regression), J48 (i.e., Decision Tree), IBK (i.e. k-nearest neighbor), NaiveBayes, and Decision Table (DT). Subsequently, we performed these experiments on public defect datasets with k-fold (k=10) cross-validation. We used F-measure and ROC-AUC (Receiver Operating Characteristic-Area Under Curve) performance measures and applied the four non-parametric tests to benchmark the fault prediction performance results of meta-learning ensemble methods. Results and Conclusion: we conclude that meta-learning ensemble methods, especially Vote could outperform the base-level fault predictors to tackle the feature irrelevance and redundancy issues in the domain of software fault prediction. Having said that, their performance is highly related to the number of base-level classifiers and the set of software fault prediction metrics.
UR - http://www.scopus.com/inward/record.url?scp=85132364987&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132364987&partnerID=8YFLogxK
U2 - 10.1145/3530019.3531333
DO - 10.1145/3530019.3531333
M3 - Conference contribution
AN - SCOPUS:85132364987
T3 - ACM International Conference Proceeding Series
SP - 413
EP - 420
BT - Proceedings of the ACM International Conference on Evaluation and Assessment in Software Engineering, EASE 2022
PB - Association for Computing Machinery
T2 - 26th ACM International Conference on Evaluation and Assessment in Software Engineering, EASE 2022
Y2 - 13 June 2022 through 15 June 2022
ER -