Abstract
Estimation of enzymatic activities still relies heavily on experimental assays, which can be cost- and time-intensive. We present CatPred, a deep learning framework for predicting in vitro enzyme kinetic parameters, including turnover numbers (kcat), Michaelis constants (Km), and inhibition constants (Ki). CatPred addresses key challenges such as the lack of standardized datasets, performance evaluation on enzyme sequences dissimilar to those seen during training, and model uncertainty quantification. We explore diverse learning architectures and feature representations, including pretrained protein language models and three-dimensional structural features, to enable robust predictions. CatPred provides accurate predictions with query-specific uncertainty estimates, with lower predicted variances correlating with higher accuracy. Pretrained protein language model features particularly enhance performance on out-of-distribution samples. CatPred also introduces benchmark datasets with extensive coverage (~23,000, ~41,000, and ~12,000 data points for kcat, Km, and Ki, respectively). Our framework performs competitively with existing methods while offering reliable uncertainty quantification.
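The abstract notes that lower predicted variances correlate with higher accuracy. A common way to obtain such query-specific variances (the paper does not specify CatPred's exact formulation, so this is an illustrative sketch, not the authors' method) is to train a regression head to output both a mean and a log-variance, minimizing the Gaussian negative log-likelihood. The function below is a self-contained, hypothetical example of that loss:

```python
import math

def gaussian_nll(y_true: float, mean: float, log_var: float) -> float:
    """Negative log-likelihood of y_true under N(mean, exp(log_var)).

    Predicting a low variance sharpens the penalty on errors, so a
    well-calibrated model emits low variance only when it is confident.
    This mirrors the behavior described in the abstract: predictions
    with lower variance tend to be more accurate.
    """
    var = math.exp(log_var)
    return 0.5 * (log_var + (y_true - mean) ** 2 / var + math.log(2 * math.pi))

# Hypothetical predictions for a kinetic parameter on a log scale.
# Small error: claiming low variance (confidence) is rewarded.
small_err_confident = gaussian_nll(y_true=1.0, mean=1.2, log_var=-2.0)
small_err_uncertain = gaussian_nll(y_true=1.0, mean=1.2, log_var=1.0)

# Large error: claiming low variance is penalized heavily.
large_err_confident = gaussian_nll(y_true=1.0, mean=3.0, log_var=-2.0)
large_err_uncertain = gaussian_nll(y_true=1.0, mean=3.0, log_var=1.0)
```

Under this loss, the model is only incentivized to report low variance when its prediction is likely accurate, which is what makes the predicted variance usable as a query-specific reliability signal.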
| Original language | English (US) |
|---|---|
| Article number | 2072 |
| Journal | Nature Communications |
| Volume | 16 |
| Issue number | 1 |
| DOIs | |
| State | Published - Dec 2025 |
All Science Journal Classification (ASJC) codes
- General Chemistry
- General Biochemistry, Genetics and Molecular Biology
- General Physics and Astronomy