Model selection for high-dimensional problems

Jing Zhi Huang, Zhan Shi, Wei Zhong

Research output: Chapter in Book/Report/Conference proceeding > Chapter

2 Scopus citations


High-dimensional data analysis is increasingly important to both academics and practitioners in finance and economics, but it is also very challenging because the number of variables or parameters associated with such data can exceed the sample size. Recently, several variable selection approaches have been developed that select significant variables and construct a parsimonious model simultaneously. In this chapter, we first provide an overview of model selection approaches in the context of penalized least squares. We then review independence screening, a recently developed method for analyzing ultrahigh-dimensional data, where the number of variables or parameters can grow exponentially with the sample size. Finally, we discuss and advocate multistage procedures that combine independence screening and variable selection and that may be especially suitable for analyzing high-frequency financial data.

Penalized least squares seeks to keep important predictors in a model while penalizing coefficients associated with irrelevant predictors. As such, under certain conditions, it can yield a sparse solution for linear models and is asymptotically consistent in separating relevant variables from irrelevant ones. Independence screening selects relevant variables based on measures of marginal correlation between each candidate variable and the response.
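The multistage idea described in the abstract (screen first, then select with a penalty) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the chapter: the function names (`marginal_screen`, `lasso_cd`, `screen_then_lasso`), the use of absolute Pearson correlation as the screening measure, the L1 (lasso) penalty fitted by plain coordinate descent, the simulated data, the screening size n / log(n), and the penalty level `lam=0.2`.

```python
import numpy as np


def marginal_screen(X, y, keep):
    """Return indices of the `keep` predictors with the largest
    absolute marginal (Pearson) correlation with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    # Constant columns get correlation 0 instead of a division by zero.
    corr = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, np.inf)
    return np.sort(np.argsort(corr)[::-1][:keep])


def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic
    coordinate descent. Assumes X and y are already centered."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            resid += X[:, j] * beta[j]          # remove predictor j's fit
            rho = X[:, j] @ resid / n           # covariance with partial residual
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]          # add the updated fit back
    return beta


def screen_then_lasso(X, y, keep, lam):
    """Two-stage procedure: marginal screening, then lasso on the survivors."""
    idx = marginal_screen(X, y, keep)
    Xs = X[:, idx] - X[:, idx].mean(axis=0)
    ys = y - y.mean()
    beta = np.zeros(X.shape[1])
    beta[idx] = lasso_cd(Xs, ys, lam)
    return beta


# Simulated example (hypothetical data): p is much larger than n,
# and only the first three predictors matter.
rng = np.random.default_rng(0)
n, p = 100, 2000
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [3.0, -3.0, 3.0]
y = X @ true_beta + 0.5 * rng.standard_normal(n)

keep = int(n / np.log(n))  # screening size; an assumed rule of thumb
beta_hat = screen_then_lasso(X, y, keep=keep, lam=0.2)
selected = np.flatnonzero(beta_hat)
print("selected predictors:", selected)
```

Screening reduces the 2000 candidates to a set smaller than the sample size, so the penalized fit in the second stage runs on a low-dimensional problem. In practice the penalty level would be chosen by cross-validation or an information criterion rather than fixed by hand.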

Original language: English (US)
Title of host publication: Handbook of Financial Econometrics and Statistics
Publisher: Springer New York
Number of pages: 26
ISBN (Electronic): 9781461477501
ISBN (Print): 9781461477495
State: Published - Jan 1 2015

All Science Journal Classification (ASJC) codes

  • General Economics, Econometrics and Finance
  • General Business, Management and Accounting
  • General Mathematics

