TY - JOUR
T1 - Efficient exploration of reaction pathways using reaction databases and active learning
AU - Kuryla, Domantas
AU - Csányi, Gábor
AU - van Duin, Adri C.T.
AU - Michaelides, Angelos
N1 - Publisher Copyright: © 2025 Author(s).
PY - 2025/3/21
Y1 - 2025/3/21
N2 - The fast and accurate simulation of chemical reactions is a major goal of computational chemistry. Recently, the pursuit of this goal has been aided by machine learning interatomic potentials (MLIPs), which provide energies and forces at quantum mechanical accuracy but at a fraction of the cost of the reference quantum mechanical calculations. Assembling the training set of relevant configurations is key to building the MLIP. Here, we demonstrate two approaches to training reactive MLIPs based on reaction pathway information. One approach exploits reaction datasets containing reactant, product, and transition state structures. Using an SN2 reaction dataset, we accurately locate reaction pathways and transition state geometries for up to 170 unseen reactions. In another approach, which does not depend on data availability, we present an efficient active learning procedure that yields an accurate MLIP and converged minimum energy path given only the reaction endpoint structures, avoiding quantum-mechanics-driven reaction pathway searches at any stage of training set construction. We demonstrate this procedure on an SN2 reaction in the gas phase and with a small number of solvating water molecules, predicting reaction barriers within 20 meV of the reference quantum chemistry method. We then apply the active learning procedure to a more complex reaction involving a nucleophilic aromatic substitution and proton transfer, comparing the results against the reactive ReaxFF force field. Our active learning procedure, in addition to rapidly finding reaction paths for individual reactions, provides an approach to building large reaction path databases for training transferable reactive machine learning potentials.
UR - http://www.scopus.com/inward/record.url?scp=105000546508&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105000546508&partnerID=8YFLogxK
U2 - 10.1063/5.0235715
DO - 10.1063/5.0235715
M3 - Article
C2 - 40116310
AN - SCOPUS:105000546508
SN - 0021-9606
VL - 162
JO - Journal of Chemical Physics
JF - Journal of Chemical Physics
IS - 11
M1 - 114122
ER -