TY - JOUR
T1 - A look at multiplicity through misclassification
AU - Dasgupta, Nairanjana
AU - Genz, Alan
AU - Lazar, Nicole A.
N1 - Publisher Copyright: © 2015, Indian Statistical Institute. Copyright 2018 Elsevier B.V., All rights reserved.
PY - 2016
Y1 - 2016
AB - Multiplicity in large-scale studies using, for example, microarray genomic data and functional neuroimaging data has been an extensively researched topic in recent years. One option often used by researchers in practice is a “top-r table”, which involves ranking the hypotheses in some order (by p-values or test statistics) and reporting the top r results. This has immediate practical application, since it yields a list of “interesting” results that are worth following up, irrespective of the actual p-value (adjusted or not). In this manuscript we take another look at multiplicity using top-r tables. Our approach is intended to be a compromise between theory and practice. We look at the relationship between the probability of correct classification, which we call r-power (the units picked in the top-r table do indeed come from the alternative), and the value of r. We analytically define r-power in terms of order statistics and quantify the probability of correct classification. We use numerical integration to calculate r-power as a function of effect size, δ; the number of hypotheses tested, N; the number of hypotheses coming from the null, k; and r. Our results indicate that r-power is positively related to effect size and negatively related to k/N. The relationship to r depends upon whether r < k. There are two possible uses of our results: based on a pre-chosen r-power, we can calculate r and decide on the number of hypotheses to be followed up; or, if r is calculated using some other criterion, we can use our method to calculate r-power in that context. We illustrate these ideas using examples from microarray and functional magnetic resonance imaging data.
UR - http://www.scopus.com/inward/record.url?scp=84986300584&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84986300584&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:84986300584
SN - 0972-7671
VL - 78B
SP - 96
EP - 118
JO - Sankhya: The Indian Journal of Statistics
JF - Sankhya: The Indian Journal of Statistics
ER -