TY - JOUR
T1 - Statistical Models for Predicting Oil Composition from Hydrothermal Liquefaction of Biomass
AU - Subramanya, Seshasayee Mahadevan
AU - Rios, Nicholas
AU - Kollar, Abbey
AU - Stofanak, Rachel
AU - Maloney, Katherine
AU - Waltz, Kayley
AU - Powers, Lucas
AU - Rane, Chinmayee
AU - Savage, Phillip E.
N1 - Publisher Copyright: © 2023 American Chemical Society.
PY - 2023/5/4
Y1 - 2023/5/4
N2 - We used 352 published data points to develop multivariate linear regression, regression tree, and random forest models that predict the chemical composition of light oil from hydrothermal liquefaction of biomass. The mean absolute error calculated from ten-fold cross-validation indicates the random forest model had the best predictive ability, followed by regression tree and multivariate linear regression models. The random forest method is also more scalable than multivariate linear regression for data points outside the range of the dataset. The decision tree methods yield minimal information for improving understanding of the HTL process chemistry. Multivariate linear regression, on the other hand, identified previously unknown ternary interactions. For example, interactions involving lipid, lignin, and protein increase the abundance of N-containing compounds in the light oil. Further experimentation with lipid, lignin, and protein model compounds showed the formation of large amounts of undesirable long-chain amides in oil. This work shows that using multiple statistical models can further deepen the understanding of the HTL process in addition to providing tools that predict process outcomes.
UR - http://www.scopus.com/inward/record.url?scp=85154057762&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85154057762&partnerID=8YFLogxK
U2 - 10.1021/acs.energyfuels.3c00297
DO - 10.1021/acs.energyfuels.3c00297
M3 - Article
AN - SCOPUS:85154057762
SN - 0887-0624
VL - 37
SP - 6619
EP - 6628
JO - Energy and Fuels
JF - Energy and Fuels
IS - 9
ER -