High-dimensional data analysis is increasingly important to both academics and practitioners in finance and economics, but it is also challenging because the number of variables or parameters associated with such data can exceed the sample size. Several variable selection approaches have recently been developed to select significant variables and construct a parsimonious model simultaneously. In this chapter, we first provide an overview of model selection approaches in the context of penalized least squares. We then review independence screening, a recently developed method for analyzing ultrahigh-dimensional data in which the number of variables or parameters can be exponentially larger than the sample size. Finally, we discuss and advocate multistage procedures that combine independence screening with variable selection and that may be especially suitable for analyzing high-frequency financial data. Penalized least squares seeks to keep important predictors in a model while penalizing the coefficients associated with irrelevant predictors. As such, under certain conditions, penalized least squares can yield a sparse solution for linear models and achieve asymptotic consistency in separating relevant variables from irrelevant ones. Independence screening selects relevant variables based on certain measures of marginal correlation between each candidate variable and the response.
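As an illustration of the multistage idea described above, the following minimal sketch first screens predictors by absolute marginal correlation with the response and then fits an L1-penalized least squares (lasso) model to the survivors by coordinate descent. It is not the chapter's own procedure: the function names (`standardize`, `screen`, `lasso_cd`), the simulated data, the screening size `d`, and the penalty level `lam` are all assumptions chosen for illustration, and the L1 penalty stands in for whichever penalty a practitioner might prefer.

```python
import random


def standardize(col):
    """Center a column and scale it to unit Euclidean norm."""
    mean = sum(col) / len(col)
    centered = [v - mean for v in col]
    norm = sum(c * c for c in centered) ** 0.5
    return [c / norm for c in centered] if norm > 0 else centered


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def screen(X_cols, y, d):
    """Stage 1: keep the d columns with the largest |x_j' y|.

    For standardized columns and centered y this ranks predictors
    by absolute marginal correlation with the response.
    """
    order = sorted(range(len(X_cols)),
                   key=lambda j: abs(dot(X_cols[j], y)), reverse=True)
    return sorted(order[:d])


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X_cols, y, lam, n_iter=200):
    """Stage 2: minimize 0.5*||y - X b||^2 + lam*sum|b_j| by coordinate descent.

    Assumes every column of X has unit norm, so each coordinate
    update reduces to a single soft-thresholding step.
    """
    beta = [0.0] * len(X_cols)
    resid = list(y)
    for _ in range(n_iter):
        for j, col in enumerate(X_cols):
            rho = dot(col, resid) + beta[j]  # x_j' (partial residual)
            new = soft_threshold(rho, lam)
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta


# Simulated example (hypothetical data): n = 100 observations, p = 500
# predictors, of which only the first three enter the true model.
random.seed(0)
n, p, d = 100, 500, 20
raw = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [3 * raw[0][i] - 2 * raw[1][i] + 2 * raw[2][i] + random.gauss(0, 0.5)
     for i in range(n)]

X_cols = [standardize(c) for c in raw]
y_mean = sum(y) / n
y_c = [v - y_mean for v in y]

kept = screen(X_cols, y_c, d)                         # screening stage
beta = lasso_cd([X_cols[j] for j in kept], y_c, 2.0)  # selection stage
selected = [j for j, b in zip(kept, beta) if b != 0.0]
print("kept after screening:", kept)
print("selected by lasso:", selected)
```

Screening reduces the problem from `p` predictors to `d`, so the penalized fit only has to work on a model smaller than the sample size. In practice `d` and `lam` would be chosen by a data-driven rule such as cross-validation rather than fixed in advance as they are here.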
All Science Journal Classification (ASJC) codes
- Economics, Econometrics and Finance(all)
- Business, Management and Accounting(all)