MACHINE LEARNING-BASED ESTIMATION OF OIL RECOVERY FACTOR USING XGBOOST: INSIGHTS FROM CLASSIFICATION AND DATA-DRIVEN ANALYSES

  • Alireza Roustazadeh
  • Frank Male
  • Behzad Ghanbarian
  • Mohammad B. Shadmand
  • Vahid Taslimitehrani
  • Larry W. Lake

    Research output: Contribution to journal › Article › peer-review

    Abstract

    In petroleum engineering, it is essential to determine the ultimate recovery factor (RF), particularly before exploitation and exploration. However, accurately estimating RF requires data that may not necessarily be available or measured at early stages of reservoir development. To address this, we applied machine learning (ML) to estimate oil RF from readily available features. To construct the ML models, we applied the XGBoost classification algorithm. Classification was chosen over regression because the recovery factor is bounded between 0 and 1, much like a probability. Three databases with various reservoir properties and recovery factors were used, giving four different combinations with which to first train and test the ML models and then further evaluate them on an independent database of unseen data. Ten-fold cross-validation was applied to the training datasets to assess the effectiveness of the models. To evaluate the accuracy and reliability of the models, the accuracy, within-1 accuracy, precision, recall, macro-averaged F1 score and R² were determined. Overall, results showed that the XGBoost classification algorithm could estimate the RF class with accuracies as high as 0.77 on the training datasets, 0.36 on the testing datasets and 0.24 on the independent databases. We found that the reliability of the XGBoost classification model depended on the data in the training dataset, indicating that the ML models were database dependent. The feature importance analysis and the Shapley Additive exPlanations (SHAP) approach showed that the most important features were reserves, reservoir area and thickness.
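
To make the classification framing and the within-1 accuracy metric concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes RF is binned into ten equal-width classes and that within-1 accuracy counts a prediction as correct when it falls in the true class or an adjacent one; the abstract states neither the number of classes nor this definition, so both are assumptions. The function names and sample values are hypothetical and purely illustrative.

```python
def rf_to_class(rf, n_classes=10):
    """Map a recovery factor in [0, 1] to an ordinal class index 0..n_classes-1.

    Assumption: equal-width bins; the class count is illustrative only.
    """
    if not 0.0 <= rf <= 1.0:
        raise ValueError("recovery factor must lie in [0, 1]")
    # rf == 1.0 would otherwise fall into bin n_classes, so clamp to the top class.
    return min(int(rf * n_classes), n_classes - 1)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def within_one_accuracy(y_true, y_pred):
    """Fraction of predictions at most one ordinal class away from the truth."""
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up recovery factors and made-up class predictions (not from the paper).
true_rf = [0.05, 0.22, 0.35, 0.48, 1.00]
y_true = [rf_to_class(rf) for rf in true_rf]
y_pred = [0, 3, 3, 6, 9]

print("true classes:", y_true)
print("accuracy:", accuracy(y_true, y_pred))
print("within-1 accuracy:", within_one_accuracy(y_true, y_pred))
```

Because the classes are ordinal, within-1 accuracy is always at least as high as exact accuracy; it gives partial credit to a classifier that lands in a neighbouring RF bin. In a full workflow the predicted classes would come from a trained classifier rather than a hand-written list.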

    Original language: English (US)
    Article number: IPJ250825-4
    Journal: InterPore Journal
    Volume: 2
    Issue number: 3
    DOIs
    State: Published - Aug 25 2025

    UN SDGs

    This output contributes to the following UN Sustainable Development Goals (SDGs)

    1. SDG 7 - Affordable and Clean Energy

    All Science Journal Classification (ASJC) codes

    • Fluid Flow and Transfer Processes
    • Renewable Energy, Sustainability and the Environment
    • Artificial Intelligence
    • Energy Engineering and Power Technology
