TY - JOUR
T1 - Handling of dioxin measurement data in the presence of non-detectable values
T2 - Overview of available methods and their application in the Seveso chloracne study
AU - Baccarelli, Andrea
AU - Pfeiffer, Ruth
AU - Consonni, Dario
AU - Pesatori, Angela C.
AU - Bonzini, Matteo
AU - Patterson, Donald G.
AU - Bertazzi, Pier Alberto
AU - Landi, Maria Teresa
PY - 2005/8
Y1 - 2005/8
N2 - Exposure measurements of concentrations that are non-detectable or near the detection limit (DL) are common in environmental research. Proper statistical treatment of non-detects is critical to avoid bias and unnecessary loss of information. In the present work, we present an overview of possible statistical strategies for handling non-detectable values, including deletion, simple substitution, distributional methods, and distribution-based imputation. Simple substitution methods (e.g., substituting 0, DL/2, DL/√2, or DL for the non-detects) are the most commonly applied, even though the EPA Guidance for Data Quality Assessment discouraged their use when the percentage of non-detects is >15%. Distribution-based multiple imputation methods, also known as robust or "fill-in" procedures, may produce dependable results even when 50-70% of the observations are non-detects and can be performed using commonly available statistical software. Any statistical analysis can be conducted on the imputed datasets. Results properly reflect the presence of non-detectable values and produce valid statistical inference. We describe the use of distribution-based multiple imputation in a recent investigation conducted on subjects from the Seveso population exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), in which 55.6% of plasma TCDD measurements were non-detects. We suggest that distribution-based multiple imputation be the preferred method to analyze environmental data when substantial proportions of observations are non-detects.
AB - Exposure measurements of concentrations that are non-detectable or near the detection limit (DL) are common in environmental research. Proper statistical treatment of non-detects is critical to avoid bias and unnecessary loss of information. In the present work, we present an overview of possible statistical strategies for handling non-detectable values, including deletion, simple substitution, distributional methods, and distribution-based imputation. Simple substitution methods (e.g., substituting 0, DL/2, DL/√2, or DL for the non-detects) are the most commonly applied, even though the EPA Guidance for Data Quality Assessment discouraged their use when the percentage of non-detects is >15%. Distribution-based multiple imputation methods, also known as robust or "fill-in" procedures, may produce dependable results even when 50-70% of the observations are non-detects and can be performed using commonly available statistical software. Any statistical analysis can be conducted on the imputed datasets. Results properly reflect the presence of non-detectable values and produce valid statistical inference. We describe the use of distribution-based multiple imputation in a recent investigation conducted on subjects from the Seveso population exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), in which 55.6% of plasma TCDD measurements were non-detects. We suggest that distribution-based multiple imputation be the preferred method to analyze environmental data when substantial proportions of observations are non-detects.
UR - http://www.scopus.com/inward/record.url?scp=21244465964&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=21244465964&partnerID=8YFLogxK
U2 - 10.1016/j.chemosphere.2005.01.055
DO - 10.1016/j.chemosphere.2005.01.055
M3 - Article
C2 - 15992596
AN - SCOPUS:21244465964
SN - 0045-6535
VL - 60
SP - 898
EP - 906
JO - Chemosphere
JF - Chemosphere
IS - 7
ER -