Dimension Reduction and Data Visualization for Regression Analysis of Metric-Space-Valued Data

Project: Research project

Project Details


The goal of this project is to systematically develop a set of data exploration and visualization tools for a new type of regression analysis for a form of data that has become increasingly common in recent applications. Such data, known as random objects, do not possess some basic properties of conventional data: for example, they do not have the directions or angles that are taken for granted in conventional analysis. Examples include mortality distributions, large covariance matrices, and observations on spheres. Many existing statistical tools, such as least squares, regression, R-squared, and dimension reduction, cannot be directly applied. A new type of regression, called Fréchet regression, has recently been developed to handle this data type. The current project aims to fill the gap between the new data type and conventional methods by transforming random objects into forms that are accessible to conventional methods with high efficiency. The project will focus on sufficient dimension reduction for the new data type. The results are expected to provide data analysis tools and related computer packages for the new type of regression, to assist preliminary data exploration, data visualization, and model diagnostics, and to improve estimation accuracy. The project will also involve training and mentoring for graduate students in modern statistical sciences.
The project aims to develop flexible and computationally scalable methods for sufficient dimension reduction for a new type of regression where both the predictor and the response can be metric-space-valued random objects. The results are intended to apply in both linear and nonlinear cases. The underlying idea can be used to convert existing methods from the multivariate setting to metric-space-valued random elements.
The main difficulty in dealing with metric-space-valued random objects is that there are no inner or outer products between observations, which are required by most of the traditional statistical tools, such as covariance matrices, correlation, projection, regression, and ANOVA decomposition. To circumvent this difficulty, the project employs a universal kernel that bridges the gap between metric spaces and Hilbert spaces, which allows construction of an independence structure within the framework of Hilbert spaces through the process of orthogonalization. The bridge provided by the universal kernel is of fundamental importance in metric-space-valued data analysis in general, going far beyond the current setting of sufficient dimension reduction, because a great number of current methods for multivariate and functional data analysis can only be used in the Hilbert space setting.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
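As a rough illustration of how a kernel can stand in for the missing inner product, the sketch below builds a Gaussian-type Gram matrix from pairwise distances between random objects. Everything specific here is an assumption made for illustration and is not taken from the project: the objects are one-dimensional distributions represented by samples, the metric is an approximate 2-Wasserstein distance computed from empirical quantiles, and the names `wasserstein2_1d`, `gaussian_gram`, `center_gram`, and the bandwidth `gamma` are hypothetical.

```python
import numpy as np


def wasserstein2_1d(x, y, grid):
    """Approximate 2-Wasserstein distance between two 1-D samples.

    In one dimension this distance is the L2 distance between quantile
    functions, approximated here on a finite grid of probability levels.
    """
    qx = np.quantile(x, grid)
    qy = np.quantile(y, grid)
    return np.sqrt(np.mean((qx - qy) ** 2))


def gaussian_gram(samples, gamma=1.0, n_grid=99):
    """Gram matrix K[i, j] = exp(-gamma * d(i, j)^2) for a list of samples.

    Only the metric d is used; no inner product on the objects is needed.
    """
    grid = np.linspace(0.01, 0.99, n_grid)
    n = len(samples)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = wasserstein2_1d(samples[i], samples[j], grid)
    return np.exp(-gamma * dist ** 2)


def center_gram(K):
    """Double-center a Gram matrix (feature-space analogue of mean removal)."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


# Toy data (assumed): eight normal samples whose means drift from -2 to 2.
rng = np.random.default_rng(0)
samples = [rng.normal(loc=m, scale=1.0, size=200) for m in np.linspace(-2, 2, 8)]

K = gaussian_gram(samples, gamma=0.5)
Kc = center_gram(K)
```

Once the centered Gram matrix `Kc` is available, covariance-like and projection-like operations can be carried out in the induced Hilbert space through ordinary linear algebra on `Kc`, which is the sense in which a kernel makes conventional tools applicable. The particular kernel and metric used in the project may differ from this sketch.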
Effective start/end date: 8/1/22 → 7/31/25


  • National Science Foundation: $290,000.00

