Statistical Models for Predicting Oil Composition from Hydrothermal Liquefaction of Biomass

Seshasayee Mahadevan Subramanya, Nicholas Rios, Abbey Kollar, Rachel Stofanak, Katherine Maloney, Kayley Waltz, Lucas Powers, Chinmayee Rane, Phillip E. Savage

Research output: Contribution to journal › Article › peer-review


We used 352 published data points to develop multivariate linear regression, regression tree, and random forest models that predict the chemical composition of light oil from hydrothermal liquefaction of biomass. The mean absolute error calculated from ten-fold cross-validation indicates the random forest model had the best predictive ability, followed by regression tree and multivariate linear regression models. The random forest method is also more scalable than multivariate linear regression for data points outside the range of the dataset. The decision tree methods yield minimal information for improving understanding of the HTL process chemistry. Multivariate linear regression, on the other hand, identified previously unknown ternary interactions. For example, interactions involving lipid, lignin, and protein increase the abundance of N-containing compounds in the light oil. Further experimentation with lipid, lignin, and protein model compounds showed the formation of large amounts of undesirable long-chain amides in oil. This work shows that using multiple statistical models can further deepen the understanding of the HTL process in addition to providing tools that predict process outcomes.
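The model comparison described in the abstract rests on mean absolute error from ten-fold cross-validation. The sketch below illustrates that procedure only; it is not the authors' code. Everything in it is an assumption made for illustration: the data are synthetic (352 points is borrowed from the abstract purely as a sample size), a single-predictor least-squares fit stands in for multivariate linear regression, a depth-1 regression tree (a "stump") stands in for the regression tree, and the random forest is omitted to keep the example within the standard library. The function names are hypothetical.

```python
import random
from statistics import mean


def make_synthetic(n=352, seed=0):
    """Synthetic stand-in data (NOT the published dataset): step-like response plus noise."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [(10.0 if x > 0.5 else 2.0) + rng.gauss(0, 1) for x in xs]
    return xs, ys


def fit_linear(xs, ys):
    """Ordinary least squares with one predictor; returns a predict function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_stump(xs, ys):
    """Depth-1 regression tree: pick the split that minimizes squared error."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total = sum(y for _, y in pairs)
    best_score, best = float("-inf"), None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        right_sum = total - left_sum
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        # Maximizing this quantity is equivalent to minimizing the split's squared error.
        score = left_sum ** 2 / i + right_sum ** 2 / (n - i)
        if score > best_score:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
            best = (threshold, left_sum / i, right_sum / (n - i))
    if best is None:  # all feature values identical: fall back to the overall mean
        overall = total / n
        return lambda x: overall
    threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def cv_mae(fit, xs, ys, k=10, seed=0):
    """Mean absolute error over all held-out points in k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    abs_errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        abs_errors.extend(abs(ys[i] - predict(xs[i])) for i in fold)
    return mean(abs_errors)


xs, ys = make_synthetic()
mae_linear = cv_mae(fit_linear, xs, ys)
mae_stump = cv_mae(fit_stump, xs, ys)
print(f"10-fold CV MAE, linear: {mae_linear:.3f}")
print(f"10-fold CV MAE, stump:  {mae_stump:.3f}")
```

Because the synthetic response is a step function, the tree-based stand-in should score the lower error here. That outcome is built into the toy data and says nothing about the published results. The point is only that each model is scored on data it was not fitted to, and the model with the lowest cross-validated error is judged the best predictor.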

Original language: English (US)
Pages (from-to): 6619-6628
Number of pages: 10
Journal: Energy and Fuels
Issue number: 9
State: Published - May 4 2023

All Science Journal Classification (ASJC) codes

  • General Chemical Engineering
  • Fuel Technology
  • Energy Engineering and Power Technology
