Extending Variability-Aware Model Selection with Bias Detection in Machine Learning Projects

Cristina Tavares, Nathalia Nascimento, Paulo Alencar, Donald Cowan

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Data science projects often involve various machine learning (ML) methods that depend on data, code, and models. One of the key activities in these projects is selecting a model or algorithm that is appropriate for the data analysis at hand. ML model selection depends on several factors, including data-related attributes such as sample size, functional requirements such as the prediction algorithm type, and nonfunctional requirements such as performance and bias. However, the factors that influence this selection are often neither well understood nor explicitly represented. This paper describes ongoing work on extending an adaptive variability-aware model selection method with bias detection in ML projects. The method involves: (i) modeling the variability of the factors that affect model selection using feature models based on heuristics proposed in the literature; (ii) instantiating our variability model with added features related to bias (e.g., bias-related metrics); and (iii) conducting experiments that illustrate the approach in a case study based on a heart failure prediction project. The proposed approach aims to advance the state of the art by making explicit the factors that influence model selection, particularly those related to bias, as well as their interactions. The provided representations can transform model selection in ML projects into a non-ad hoc, adaptive, and explainable process.
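
    As a rough illustration of the idea in the abstract, the sketch below treats each candidate model as a bundle of features (a data-related sample-size heuristic, a functional algorithm type, and nonfunctional accuracy and bias values) and filters candidates against them. It is not the paper's feature model or its case-study code: the names `Candidate`, `demographic_parity_difference`, and `select_model`, the choice of demographic parity as the bias metric, the thresholds, and the toy data are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    """Hypothetical description of one candidate model (not from the paper)."""
    name: str
    algorithm_type: str      # functional requirement, e.g. "classification"
    min_samples: int         # data-related heuristic: smallest suitable sample size
    accuracy: float          # nonfunctional requirement: measured performance
    predictions: List[int]   # 0/1 predictions on a held-out set


def demographic_parity_difference(predictions: List[int], groups: List[str]) -> float:
    """Gap between the highest and lowest positive-prediction rate across groups."""
    rates = []
    for g in sorted(set(groups)):
        members = [p for p, gg in zip(predictions, groups) if gg == g]
        rates.append(sum(members) / len(members))
    return max(rates) - min(rates)


def select_model(
    candidates: List[Candidate],
    groups: List[str],
    sample_size: int,
    algorithm_type: str,
    max_bias: float,
) -> Optional[Tuple[str, float]]:
    """Return (name, bias) of the most accurate candidate meeting every constraint."""
    feasible = []
    for c in candidates:
        if c.algorithm_type != algorithm_type:
            continue  # violates the functional requirement
        if sample_size < c.min_samples:
            continue  # violates the data-related heuristic
        bias = demographic_parity_difference(c.predictions, groups)
        if bias > max_bias:
            continue  # violates the bias-related requirement
        feasible.append((c, bias))
    if not feasible:
        return None
    best, best_bias = max(feasible, key=lambda pair: pair[0].accuracy)
    return best.name, best_bias


# Toy data, invented for this sketch: eight held-out samples in two groups.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
candidates = [
    Candidate("model_a", "classification", 5, 0.90, [1, 1, 1, 0, 0, 0, 0, 1]),
    Candidate("model_b", "classification", 5, 0.85, [1, 0, 1, 0, 1, 0, 0, 1]),
    Candidate("model_c", "classification", 1000, 0.95, [1, 0, 1, 0, 1, 0, 0, 1]),
]
choice = select_model(candidates, groups, sample_size=8,
                      algorithm_type="classification", max_bias=0.2)
```

    With this toy data, `model_a` is rejected for exceeding the bias threshold and `model_c` for needing more samples than are available, so the less accurate but unbiased `model_b` is chosen. Making each rejection reason explicit is one simple way to keep such a selection explainable.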

    Original language: English (US)
    Title of host publication: Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023
    Editors: Jingrui He, Themis Palpanas, Xiaohua Hu, Alfredo Cuzzocrea, Dejing Dou, Dominik Slezak, Wei Wang, Aleksandra Gruca, Jerry Chun-Wei Lin, Rakesh Agrawal
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 2441-2449
    Number of pages: 9
    ISBN (Electronic): 9798350324457
    State: Published - 2023
    Event: 2023 IEEE International Conference on Big Data, BigData 2023 - Sorrento, Italy
    Duration: Dec 15 2023 to Dec 18 2023

    Publication series

    Name: Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023

    Conference

    Conference: 2023 IEEE International Conference on Big Data, BigData 2023
    Country/Territory: Italy
    City: Sorrento
    Period: 12/15/23 to 12/18/23

    All Science Journal Classification (ASJC) codes

    • Artificial Intelligence
    • Computer Networks and Communications
    • Computer Science Applications
    • Information Systems
    • Information Systems and Management
    • Safety, Risk, Reliability and Quality
