TY - JOUR
T1 - Augmenting the bootstrap to analyze high dimensional genomic data
AU - Tyekucheva, Svitlana
AU - Chiaromonte, Francesca
N1 - Funding Information: This invited paper is discussed in the comments available at: http://dx.doi.org/10.1007/s11749-008-0099-5, http://dx.doi.org/10.1007/s11749-008-0100-3, http://dx.doi.org/10.1007/s11749-008-0101-2, http://dx.doi.org/10.1007/s11749-008-0102-1, http://dx.doi.org/10.1007/s11749-008-0103-0, http://dx.doi.org/10.1007/s11749-008-0104-z, http://dx.doi.org/10.1007/s11749-008-0105-y, http://dx.doi.org/10.1007/s11749-008-0106-x. This work was partially supported by NIH grant HG02238 to W. Miller, NIH grant R01-GM072264 to K. Makova, and NSF grant DMS-0704621 to R.D. Cook, B. Li and F. Chiaromonte.
PY - 2008/5
Y1 - 2008/5
AB - The data produced by high-throughput genomic techniques are often high dimensional and undersampled. In these settings, statistical analyses that require the inversion of covariance matrices, such as those pursuing supervised dimension reduction or the assessment of interdependence structures, are problematic. In this article we show how the idea of adding noise to the bootstrap, pioneered by Efron, and Silverman and Young, in the late seventies and eighties, can be used to overcome undersampling and effectively estimate the inverse covariance matrix for data sets in which the number of observations is small relative to the number of variables. We demonstrate the performance of this approach, which we call augmented bootstrap, on simulated data and on data derived from genomic DNA sequences and microarray experiments.
UR - http://www.scopus.com/inward/record.url?scp=40949165781&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=40949165781&partnerID=8YFLogxK
U2 - 10.1007/s11749-008-0098-6
DO - 10.1007/s11749-008-0098-6
M3 - Article
AN - SCOPUS:40949165781
SN - 1133-0686
VL - 17
SP - 1
EP - 18
JO - Test
JF - Test
IS - 1
ER -