Project Details
Description
Today, machine learning is a prominent scientific computing tool with many practical applications. Notable successes include image classification and the artificial intelligence (AI) Go player that beat the best human player in the world. While these successes are important milestones, there is an emerging need to replicate them in the statistical modeling of time-evolving complex systems, with examples ranging from climate prediction to nanomaterials under external disturbances. The goal of this project is to develop next-generation mathematical and algorithmic tools that overcome two important obstacles to extending machine learning to such problems: a shortage of informative data for effective learning and high computational cost. This objective will be addressed through theoretical and algorithmic developments in computational mathematics, leveraging fundamental knowledge from the basic sciences, including geometry, dynamical systems, data science, and statistics. This project will contribute to the NSF mission of advancing STEM through the training of graduate students and through curricular development, namely the design of courses in the mathematical theory of machine learning. In particular, this project will support one graduate student.

The technical goals of this project are to overcome the shortage of training data and to exploit the manifold assumption to avoid the curse of dimensionality in the statistical modeling of dynamical systems. Beyond uncertainty quantification (UQ) applications, a statistical closure model will be developed to enhance the training of ML-based prediction models when the observed time series is too short for accurate estimation. Specifically, the proposed tasks are:
1) To develop a systematic reduced-order statistical closure model. This task extends the recently developed ML-based non-Markovian closure framework for accurate prediction of statistical responses to unseen external forcings, which is important for UQ.
2) To develop a dimensionality reduction technique that respects the geometry of the data under a manifold assumption on the dynamical variables. The approach includes an accurate Radial Basis Function approximation to the Bochner Laplacian from the embedded data. Subsequently, the estimated eigenvector fields will be used as a frame to represent the vector fields corresponding to the unresolved dynamics. This model reduction framework provides a computationally cheaper alternative to deep learning.
3) To study the theoretical convergence properties of a recently developed algorithm, Bayesian Machine Learning (BML), which uses solutions of a statistically consistent model to enhance the training of a neural network (NN) model in learning non-Markovian dynamics from a short observational time series. This study is motivated by a recent empirical finding that the NN model obtained from the BML training algorithm improves the accuracy of El Niño prediction by at least two months compared to the same NN architecture trained using the standard stochastic gradient descent algorithm. The ultimate goal of this study is to evaluate, and develop a theoretical understanding of, the effectiveness of the statistical closure model from Task 1) in enhancing Bayesian Machine Learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Status | Active |
---|---|
Effective start/end date | 8/1/22 → 7/31/25 |
Funding
- National Science Foundation: $300,000.00