TY - JOUR
T1 - Missing Data in Research on Youth and Family Programs
AU - Ballard, Jaime
AU - Richmond, Adeya
AU - van den Hoogenhof, Suzanne
AU - Borden, Lynne
AU - Perkins, Daniel Francis
N1 - Publisher Copyright:
© The Author(s) 2021.
PY - 2022/10
Y1 - 2022/10
N2 - Background: Multilevel data can be missing at the individual level or at a nested level, such as family, classroom, or program site. Increased knowledge of higher-level missing data is necessary to develop evaluation designs and statistical methods that address it. Methods: Participants included 9,514 individuals participating in 47 youth and family programs nationwide who completed multiple self-report measures before and after program participation. Data were marked as missing or not missing at the item, scale, and wave levels for both individuals and program sites. Results: Site-level missing data represented a substantial portion of missing data, ranging from 0% to 46% of missing data at pre-test and from 35% to 71% of missing data at post-test. Youth were the most likely to be missing data, although site-level missing data did not differ by the age of participants served. In this dataset, youth had the most surveys to complete, so their missing data could be due to survey fatigue. Conclusions: Much of the missing data for individuals can be explained by the site not administering those questions or scales. These results suggest a need for statistical methods that account for site-level missing data, and for research design methods that reduce the prevalence of site-level missing data or reduce its impact. Researchers can generate buy-in with sites during the community collaboration stage, assessing problematic items for revision or removal and the need for ongoing site support, particularly at post-test. We recommend that researchers conducting multilevel research report the amount and mechanism of missing data at each level.
UR - http://www.scopus.com/inward/record.url?scp=85109026020&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85109026020&partnerID=8YFLogxK
U2 - 10.1177/00332941211026851
DO - 10.1177/00332941211026851
M3 - Article
C2 - 34192999
AN - SCOPUS:85109026020
SN - 0033-2941
VL - 125
SP - 2664
EP - 2687
JO - Psychological Reports
JF - Psychological Reports
IS - 5
ER -