Abstract
Massive data sets have become increasingly common in contemporary scientific research. When the sample size n is huge, classical learning methods become computationally costly in regression analysis. Recently, the orthogonal greedy algorithm (OGA) has been revitalized as an efficient alternative in the context of kernel-based statistical learning. Because accurate and fast prediction is often the goal in a learning problem, an appropriate termination rule is crucial for OGA. In this paper, we propose a new termination rule for OGA by investigating its predictive performance. The proposed rule is conceptually simple and easy to implement, and it suggests an O(√(n/log n)) number of essential updates in an OGA process. It therefore provides an appealing route to efficient learning for massive data. With a sample-dependent kernel dictionary, we show that the proposed method is strongly consistent, with an O(√(log n/n)) convergence rate to the oracle prediction. The promising performance of the method is supported by simulation and data examples.
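As a rough illustration of the setting described in the abstract (not the paper's exact procedure), the sketch below runs a plain orthogonal greedy algorithm over a sample-dependent kernel dictionary {K(·, x_i)} and stops after on the order of √(n/log n) updates. The Gaussian kernel, the correlation-based atom selection, and the hard iteration cap are assumptions made for the example; the paper's actual data-driven termination rule is not reproduced here.

```python
import numpy as np

def gaussian_kernel(X, Z, gamma=1.0):
    # Pairwise Gaussian kernel matrix between the rows of X and Z (assumed kernel choice).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def oga_fit(X, y, gamma=1.0, max_iter=None):
    """Plain OGA over the sample kernel dictionary {K(., x_i)}_{i=1}^n,
    stopped after roughly sqrt(n / log n) essential updates (illustrative cap only)."""
    n = len(y)
    if max_iter is None:
        max_iter = int(np.ceil(np.sqrt(n / np.log(n))))  # order of updates stated in the abstract
    K = gaussian_kernel(X, X, gamma)      # columns are dictionary atoms evaluated on the sample
    residual = y.astype(float).copy()
    selected, Q = [], np.zeros((n, 0))
    for _ in range(max_iter):
        # Greedy step: pick the atom most correlated (in absolute value) with the residual.
        scores = np.abs(K.T @ residual) / (np.linalg.norm(K, axis=0) + 1e-12)
        j = int(np.argmax(scores))
        if j in selected:
            break
        selected.append(j)
        # Orthogonalize the new atom against previously selected ones (Gram-Schmidt).
        q = K[:, j] - Q @ (Q.T @ K[:, j])
        norm = np.linalg.norm(q)
        if norm < 1e-10:
            break
        q /= norm
        Q = np.hstack([Q, q[:, None]])
        residual -= q * (q @ residual)    # project the residual onto the new direction
    fitted = y - residual
    return selected, fitted

# Usage sketch on synthetic data:
# rng = np.random.default_rng(0)
# X = rng.normal(size=(500, 3)); y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
# idx, yhat = oga_fit(X, y)
```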
| Field | Value |
|---|---|
| Original language | English (US) |
| Pages (from-to) | 841-860 |
| Number of pages | 20 |
| Journal | Statistica Sinica |
| Volume | 26 |
| Issue number | 2 |
| DOIs | |
| State | Published - Apr 2016 |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty