Project Details
Description
Multivariate functional data have become increasingly prevalent with the rise of big data. Examples include EEG (electroencephalography) and fMRI data, GPS tracking data collected by mobile phones, health data collected by smart wearables, some global economic data, and data collected from electric power networks. The key issue in analyzing such data is to discover interrelations among the different random functions describing them. Currently, this is typically done under the statistical assumption of normal (Gaussian) distributions, as this leads to computationally simple and highly interpretable estimation procedures in many applications. The Gaussian assumption, however, is quite strong and is not satisfied in many applications. The main goal of this project is to relax the Gaussian assumption while retaining the computational simplicity and high interpretability of Gaussian methods. This is done by developing a notion of a Gaussian copula model, a way in which multivariate functional data can be transformed to Gaussian-distributed data. Such copula approaches have been extremely versatile in classical low-dimensional settings, combining the parsimony of multivariate Gaussian models with the flexibility of nonparametric models. Extending copula-based ideas to multivariate functional data will benefit statistical graphical models, statistical models for causal relations, nonlinear sufficient dimension reduction, and variable selection, bringing fresh insights and research opportunities to a young and dynamic field.
This project proposes a novel functional copula model for non-Gaussian and nonlinear multivariate functional data analysis. The idea is to apply rank and quantile transformations to the coefficients of the Karhunen-Loève expansions of the random functions. The functional Gaussian copula model greatly simplifies conditional dependence among different random functions in the expansion, as the conditional distribution is completely determined by the covariance operator among random functions. In particular, smoothing over functional spaces is not needed. At the same time, the model does not require Gaussian assumptions, which are easily violated by functional data. These properties are very useful for constructing graphical models. The functional elliptical distribution copula model is useful for sufficient dimension reduction, because it is a convenient way to meet the linearity requirements posed by many commonly used dimension reduction methods. Equipped with this parsimonious but flexible framework, the project will develop new methodologies for four areas of multivariate functional data analysis: (i) functional graphical models for undirected graphs; (ii) functional graphical models for causal graphs; (iii) functional sufficient dimension reduction; and (iv) variable selection for function-on-function regression. The work will study asymptotic properties of these new estimators in both fixed-dimensional and high-dimensional settings, statistical inference procedures, order determination methods, and efficient algorithms to implement these methods.
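The rank and quantile transformation of Karhunen-Loève coefficients described above can be illustrated with a minimal sketch. It is not the project's estimator: it assumes curves observed on a common grid, approximates the expansion coefficients by an SVD of the centered data, and uses a plain rank-based normal-score transform on each coefficient. The function names (`kl_scores`, `normal_scores`, `copula_transform`) and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np
from statistics import NormalDist


def kl_scores(curves, n_components):
    """Empirical Karhunen-Loeve scores for curves sampled on a common grid.

    curves: array of shape (n_samples, n_gridpoints).
    Returns an array of shape (n_samples, n_components).
    """
    centered = curves - curves.mean(axis=0)
    # Rows of vt are discretized eigenfunctions of the sample covariance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def normal_scores(x):
    """Rank-based transform of a 1-D sample to approximately N(0, 1) margins."""
    n = len(x)
    ranks = np.argsort(np.argsort(x)) + 1   # ranks 1..n (assumes no ties)
    u = ranks / (n + 1)                     # rescaled strictly inside (0, 1)
    inv_cdf = NormalDist().inv_cdf          # standard normal quantile function
    return np.array([inv_cdf(p) for p in u])


def copula_transform(scores):
    """Apply the normal-score transform to each score column separately."""
    return np.column_stack([normal_scores(col) for col in scores.T])


# Toy data (an assumption for illustration): two random functions whose
# expansion coefficients are lognormal, hence clearly non-Gaussian.
rng = np.random.default_rng(0)
n, grid = 200, np.linspace(0, 1, 50)
basis = np.vstack([np.sin(np.pi * grid), np.sin(2 * np.pi * grid)])
z = rng.standard_normal((n, 2))
x_scores = np.exp(z) * [2.0, 1.0]
y_scores = np.exp(0.8 * z + 0.6 * rng.standard_normal((n, 2))) * [2.0, 1.0]
X = x_scores @ basis
Y = y_scores @ basis

transformed = np.hstack([
    copula_transform(kl_scores(X, 2)),
    copula_transform(kl_scores(Y, 2)),
])
# Correlation of the transformed scores: under a Gaussian copula assumption,
# dependence between the two functions would be read off a matrix like this.
corr = np.corrcoef(transformed, rowvar=False)
```

In this sketch each coefficient is transformed marginally, so only the ranks of the scores matter, and dependence between the two functions is summarized by a single correlation matrix rather than by smoothing over function spaces.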
| Status | Finished |
|---|---|
| Effective start/end date | 8/15/17 → 7/31/20 |
Funding
- National Science Foundation: $200,000.00