Adaptive Method for Machine Learning Model Selection in Data Science Projects

Cristina Tavares, Nathalia Nascimento, Paulo Alencar, Donald Cowan

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    2 Scopus citations

    Abstract

    Data science projects involve a machine learning (ML) process based on data, code, and models that change over time. For example, a dataset may grow large enough to permit an ML model that requires more data. However, the dynamic factors that influence model selection are neither well understood nor explicitly represented. This paper presents ongoing work on an adaptive method for ML model selection in big data science projects. The proposed method involves (i) identifying the factors that affect model selection based on heuristics proposed in the literature; and (ii) modeling the variability of these factors using a feature diagram and constraints that trigger adaptive reconfiguration, that is, changes in model selection due to changes in the variability factors. The applicability of the method is demonstrated through an illustrative use case. The proposed method can lead to an improved understanding of the dynamic factors that influence model selection, how these factors explicitly affect the selection, and how the adaptive factors can be represented and automated. This improved understanding can result in a model selection process that is less implicit, more efficient, more adaptive, and more explainable, and can ultimately provide a foundation for novel dynamic software product lines that support this process.
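
As a rough illustration of step (ii), the sketch below treats variability factors as a small context object and expresses constraints as predicates that decide which model is selectable. It is not the authors' implementation: the factor names (`n_samples`, `labeled`), the model names, and the 100,000-sample threshold are all hypothetical, chosen only to mirror the abstract's example of a growing dataset enabling a model that needs more data.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Context:
    """Hypothetical variability factors; the abstract does not list the real ones."""
    n_samples: int
    labeled: bool


@dataclass(frozen=True)
class Constraint:
    """Binds a candidate model to a predicate over the current context."""
    model: str
    holds: Callable[[Context], bool]


# Illustrative constraints in order of preference (assumed, not from the paper).
CONSTRAINTS: List[Constraint] = [
    Constraint("neural_network", lambda c: c.labeled and c.n_samples >= 100_000),
    Constraint("linear_model", lambda c: c.labeled),
    Constraint("clustering", lambda c: not c.labeled),
]


def select_model(ctx: Context) -> str:
    """Return the first model whose constraint is satisfied by the context."""
    for constraint in CONSTRAINTS:
        if constraint.holds(ctx):
            return constraint.model
    raise ValueError("no model satisfies the constraints for this context")


def reconfigure(current: str, ctx: Context) -> Tuple[str, bool]:
    """Re-evaluate the selection; the flag reports whether it changed."""
    selected = select_model(ctx)
    return selected, selected != current


if __name__ == "__main__":
    model = select_model(Context(n_samples=1_000, labeled=True))
    # The dataset grows, so the selection is re-evaluated.
    model, changed = reconfigure(model, Context(n_samples=200_000, labeled=True))
    print(model, changed)
```

Ordering the constraints by preference keeps the selection deterministic when several models are admissible; a fuller treatment would encode the feature diagram itself, including its mandatory, optional, and alternative features.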

    Original language: English (US)
    Title of host publication: Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
    Editors: Shusaku Tsumoto, Yukio Ohsawa, Lei Chen, Dirk Van den Poel, Xiaohua Hu, Yoichi Motomura, Takuya Takagi, Lingfei Wu, Ying Xie, Akihiro Abe, Vijay Raghavan
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 2682-2688
    Number of pages: 7
    ISBN (Electronic): 9781665480451
    State: Published - 2022
    Event: 2022 IEEE International Conference on Big Data, Big Data 2022 - Osaka, Japan
    Duration: Dec 17 2022 to Dec 20 2022

    Publication series

    Name: Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022

    Conference

    Conference: 2022 IEEE International Conference on Big Data, Big Data 2022
    Country/Territory: Japan
    City: Osaka
    Period: 12/17/22 to 12/20/22

    All Science Journal Classification (ASJC) codes

    • Modeling and Simulation
    • Computer Networks and Communications
    • Information Systems
    • Information Systems and Management
    • Safety, Risk, Reliability and Quality
    • Control and Optimization
