Abstract
We consider the problem of density estimation when the data is in the form of a continuous stream with no fixed length. In this setting, implementations of the usual methods of density estimation such as kernel density estimation are problematic. We propose a method of density estimation for massive datasets that is based upon taking the derivative of a smooth curve that has been fit through a set of quantile estimates. To achieve this, a low-storage, single-pass, sequential method is proposed for simultaneous estimation of multiple quantiles for massive datasets that form the basis of this method of density estimation. For comparison, we also consider a sequential kernel density estimator. The proposed methods are shown through simulation study to perform well and to have several distinct advantages over existing methods.
Original language | English (US) |
---|---|
Pages (from-to) | 311-321 |
Number of pages | 11 |
Journal | Statistics and Computing |
Volume | 17 |
Issue number | 4 |
DOIs | |
State | Published - Dec 1 2007 |
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Statistics and Probability
- Statistics, Probability and Uncertainty
- Computational Theory and Mathematics