Bias Analysis for Misclassification Errors in both the Response Variable and Covariate

Juxin Liu, Annshirley Afful, Holly Mansell, Yanyuan Ma

Research output: Contribution to journal › Article › peer-review


Abstract–Much literature has focused on statistical inference for misclassified response variables or misclassified covariates. However, misclassification in both the response variable and the covariate has received very limited attention within applied fields and the statistics community. In situations where the response variable and the covariate are simultaneously subject to misclassification errors, an assumption of independent misclassification errors is often used for convenience without justification. This article aims to show the harmful consequences of inappropriate adjustment for joint misclassification errors. In particular, we focus on the wrong adjustment by ignoring the dependence between the misclassification process of the response variable and the covariate. In this article, the dependence of misclassification in both variables is characterized by covariance-type parameters. We extend the original definition of dependence parameters to a more general setting. We discover a single quantity that governs the dependence of the two misclassification processes. Moreover, we propose likelihood ratio tests to check the nondifferential/independent misclassification assumption in main study/internal validation study designs. Our simulation studies indicate that ignoring the dependent error structure can be even worse than ignoring all the misclassification errors when the validation data size is relatively small. The methodology is illustrated by a real data example.
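To make the abstract's central claim concrete, the following is a minimal population-level sketch, not the authors' simulation design or estimator. It assumes a binary response Y and binary covariate X, nondifferential misclassification with sensitivity and specificity 0.85 for both, and a single hypothetical covariance value `cov` added to every true cell (the article allows more general dependence parameters). The cell probabilities, the function names, and the use of the odds ratio as the target are all illustrative assumptions.

```python
import numpy as np

def joint_error_matrix(sn_y, sp_y, sn_x, sp_x, cov=0.0):
    """Return a 4x4 matrix M with M[obs, true] = P(Y*, X* | Y, X).

    Cells are indexed as 2*y + x, so the order is (0,0), (0,1), (1,0), (1,1).
    `cov` is a hypothetical covariance between the observed indicators Y* and
    X* given the truth, assumed identical in every true cell for simplicity.
    """
    M = np.zeros((4, 4))
    for y in (0, 1):
        for x in (0, 1):
            p = sn_y if y == 1 else 1 - sp_y  # P(Y* = 1 | Y = y)
            q = sn_x if x == 1 else 1 - sp_x  # P(X* = 1 | X = x)
            col = 2 * y + x
            M[3, col] = p * q + cov              # observed (1, 1)
            M[2, col] = p * (1 - q) - cov        # observed (1, 0)
            M[1, col] = (1 - p) * q - cov        # observed (0, 1)
            M[0, col] = (1 - p) * (1 - q) + cov  # observed (0, 0)
    return M

def odds_ratio(cells):
    """Odds ratio from cell probabilities ordered (0,0), (0,1), (1,0), (1,1)."""
    p00, p01, p10, p11 = cells
    return (p11 * p00) / (p10 * p01)

# Illustrative true joint distribution of (Y, X); not taken from the article.
true_cells = np.array([0.4, 0.2, 0.1, 0.3])
true_or = odds_ratio(true_cells)

# Observed distribution under dependent misclassification.
M_dep = joint_error_matrix(0.85, 0.85, 0.85, 0.85, cov=0.02)
observed = M_dep @ true_cells

# 1) Ignore misclassification entirely.
naive_or = odds_ratio(observed)

# 2) Adjust, but wrongly assume the two error processes are independent.
M_ind = joint_error_matrix(0.85, 0.85, 0.85, 0.85, cov=0.0)
or_assuming_independence = odds_ratio(np.linalg.solve(M_ind, observed))

# 3) Adjust with the correct dependent error structure.
or_with_dependence = odds_ratio(np.linalg.solve(M_dep, observed))

print(f"true OR:                      {true_or:.3f}")
print(f"unadjusted OR:                {naive_or:.3f}")
print(f"adjusted, independence assumed: {or_assuming_independence:.3f}")
print(f"adjusted, dependence modeled:   {or_with_dependence:.3f}")
```

In this particular configuration, the adjustment that assumes independence lands further from the true odds ratio than making no adjustment at all, while the adjustment using the correct joint error matrix recovers it. The sketch works with exact probabilities, so it says nothing about sampling variability or validation sample size, which the article's simulations address.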

Original language: English (US)
Pages (from-to): 353-362
Number of pages: 10
Journal: American Statistician
Issue number: 4
State: Published - 2022

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • General Mathematics
  • Statistics, Probability and Uncertainty

