Variable Selection in High-Dimensional Modeling and Its Oracle Properties

Project: Research project

Project Details

Description

High-dimensional data, such as biotechnology and genetic data, financial data, satellite imagery, and hyperspectral imagery, are now commonplace, and high-dimensional data analysis has become an important research topic in statistics. Variable selection is fundamental to high-dimensional statistical modeling. Many approaches currently in use are stepwise selection procedures, which are computationally expensive and ignore stochastic errors in the variable selection stage. This research draws on a variety of data-analytic techniques to develop a unified, effective variable selection procedure for high-dimensional statistical modeling. The goal of this project is to significantly expand the tools available for analyzing complicated high-dimensional data.

In this project, penalized least squares and penalized likelihood approaches are proposed to select significant variables for various models used in high-dimensional data analysis. The proposed approach differs from others in that it deletes insignificant covariates by estimating their coefficients as zero. In other words, it simultaneously selects significant variables and estimates their regression coefficients, thereby enabling one to construct confidence intervals for the estimated parameters. An algorithm is proposed for solving the optimization problems that arise in penalized least squares and penalized likelihood. The rates of convergence and the sampling properties of the resulting estimators are investigated.
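
To illustrate how a penalized least squares fit can set some coefficients exactly to zero, here is a minimal sketch. It is not the project's algorithm: the description does not name the penalty function or the optimization method, so the sketch assumes an L1 (lasso-type) penalty solved by cyclic coordinate descent. The function names, the tuning parameter `lam`, and the small orthogonal design are all hypothetical and chosen only for illustration.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def penalized_least_squares(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * sum_j |b_j|.

    Assumption: an L1 penalty and cyclic coordinate descent are used here
    as stand-ins; the project description specifies neither.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = [float(v) for v in y]  # y - X @ beta, with beta = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Average product of column j with the partial residual,
            # i.e. the residual with column j's current contribution added back.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Hypothetical toy data: three mutually orthogonal columns, and a response
# built as 2 * (column 0) + 0.05 * (column 1) + 0 * (column 2).
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y = [2.05, 1.95, 2.05, 1.95]

beta = penalized_least_squares(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

In this toy example the weak second covariate and the irrelevant third covariate are estimated as exactly zero, while the first coefficient is kept and shrunk from 2 to about 1.9. Selection and estimation therefore happen in a single fit, which is the behavior the paragraph above describes. The shrinkage of the retained coefficient is a property of the assumed L1 penalty, not necessarily of the penalty used in the project.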

Status: Finished
Effective start/end date: 7/1/01 to 5/31/05

Funding

  • National Science Foundation: $96,769.00
