TY - JOUR
T1 - Lessons for machine learning from the analysis of porosity-permeability transforms for carbonate reservoirs
AU - Male, Frank
AU - Duncan, Ian J.
N1 - Funding Information:
We are grateful to F. Jerry Lucia for providing the raw data, thin sections and image analyses that he used in his 1995 paper. Vinyet Baqués generously provided porosity, permeability, and facies data from the SSAU reservoir as well as providing useful discussion. Statistical analysis was performed in the R language (R Core Team, 2014). Plots were generated using the GGPlot2 package (Wickham, 2009). This study was funded by the US Department of Energy (DOE) grant FE0024375 (PI: Duncan). The authors are in debt to Drs. Jerry Jensen, Larry Lake, Carlos Torres-Verdín, and Bo Ren for valuable conversations and feedback.
Publisher Copyright:
© 2019 Elsevier B.V.
PY - 2020/4
Y1 - 2020/4
N2 - Prediction of permeability is one of the most difficult aspects of reservoir characterization because permeability cannot be directly measured by current well logging technology. This is particularly challenging for carbonate rocks. Machine learning (ML) and robust multivariate methods have been developed that have been used in many fields of study to make accurate estimators for variables of interest from both large and small datasets. ML has been criticized for utilizing approaches that are typically not interpretable. That is, it is not clear how the answers are arrived at and what aspects of input data may be resulting in inaccurate results. The current study uses a number of the mathematical algorithms that operate inside ML modules. It applies them to developing porosity-permeability transforms, with or without rock types, to two well-characterized data sets for carbonate reservoirs. One data set is from Jerry Lucia's 1995 study of carbonate rock types, and the other is from a study of the Seminole, West Texas, San Andres Unit. This study of statistical analysis of porosity-permeability transforms includes: transforming the data to normal distributions; performing cross-validation blind testing; and detecting heteroscedasticity by creating plots of residuals. Heteroscedastic data (populations with variable variance) may have an adverse impact on ML algorithms such as Random Forests (RF). We find that including lithofacies information does not greatly improve porosity-permeability transforms. We also propose a number of strategies to make ML analyses of reservoir (and other geosciences) data sets more robust and accurate.
AB - Prediction of permeability is one of the most difficult aspects of reservoir characterization because permeability cannot be directly measured by current well logging technology. This is particularly challenging for carbonate rocks. Machine learning (ML) and robust multivariate methods have been developed that have been used in many fields of study to make accurate estimators for variables of interest from both large and small datasets. ML has been criticized for utilizing approaches that are typically not interpretable. That is, it is not clear how the answers are arrived at and what aspects of input data may be resulting in inaccurate results. The current study uses a number of the mathematical algorithms that operate inside ML modules. It applies them to developing porosity-permeability transforms, with or without rock types, to two well-characterized data sets for carbonate reservoirs. One data set is from Jerry Lucia's 1995 study of carbonate rock types, and the other is from a study of the Seminole, West Texas, San Andres Unit. This study of statistical analysis of porosity-permeability transforms includes: transforming the data to normal distributions; performing cross-validation blind testing; and detecting heteroscedasticity by creating plots of residuals. Heteroscedastic data (populations with variable variance) may have an adverse impact on ML algorithms such as Random Forests (RF). We find that including lithofacies information does not greatly improve porosity-permeability transforms. We also propose a number of strategies to make ML analyses of reservoir (and other geosciences) data sets more robust and accurate.
UR - http://www.scopus.com/inward/record.url?scp=85076915338&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076915338&partnerID=8YFLogxK
U2 - 10.1016/j.petrol.2019.106825
DO - 10.1016/j.petrol.2019.106825
M3 - Article
AN - SCOPUS:85076915338
SN - 0920-4105
VL - 187
JO - Journal of Petroleum Science and Engineering
JF - Journal of Petroleum Science and Engineering
M1 - 106825
ER -