Inter-rater reliability in child sexual abuse diagnosis among expert reviewers

Suzanne P. Starling, Lori D. Frasier, Kristi Jarvis, Anne McDonald

Research output: Contribution to journal › Article › peer-review

10 Scopus citations


Objectives: To determine how well experts agree when assessing child sexual abuse cases.

Methods: Twelve physician experts were recruited from an existing peer review network and enrolled voluntarily. Experts in the network had been chosen for their experience in the field and their affiliation with children's advocacy centers. Each expert submitted three cases of prepubertal female genital examinations that clearly demonstrated the case findings. Submitted cases included demographics, history, physical and genital exam findings, photodocumentation, and diagnosis. Experts reviewed each submitted case and labeled it negative for physical finding(s), positive for physical finding(s), or indeterminate. Cases were analyzed to determine the level of agreement.

Results: Thirty-six cases were submitted for use in this study; one case was excluded before the review process began. After all experts had completed their reviews, the authors reviewed the cases and results. Two additional cases were excluded, one because of poor-quality photodocumentation and one for not meeting the study criteria. Thirty-three cases were used for data analysis. All 12 expert reviewers agreed in 15 of the cases. Overall, in 22 of 33 (67%) cases at least 11 of the 12 reviewers agreed with the original diagnosis. Six of 33 (18%) cases had variable agreement among reviewers (8-10 reviewers agreed with the original diagnosis); 5 of 33 (15%) cases had poor or mixed agreement (7 or fewer reviewers agreed with the original diagnosis).

Conclusions: Experts exhibit consensus in cases where the findings are clearly normal or clearly abnormal, but demonstrate much more variability in cases where the diagnostic decisions are less obvious. Most of the diagnostic variability is due to the interpretation of the findings as normal, abnormal, or indeterminate, not to the perception of the examination findings themselves. More research should be done to develop a national consensus on the accurate interpretation of anogenital examination findings. Photographic image quality plays an important role in this quality review process and universally needs to be improved.
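The case counts and percentages reported in the Results can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the study: the variable names, band labels, and the `band_percentages` helper are illustrative assumptions, and the numbers are taken directly from the abstract. It only recomputes raw proportions; it does not compute any chance-corrected agreement statistic.

```python
# Counts as reported in the abstract's Results (names here are illustrative).
submitted = 36
excluded_before_review = 1   # excluded before the review process began
excluded_after_review = 2    # poor photodocumentation; study criteria not met

# Cases remaining for data analysis.
analyzed = submitted - excluded_before_review - excluded_after_review

# Number of cases in each agreement band, keyed by a descriptive label.
agreement_bands = {
    "high (at least 11 of 12 agree)": 22,
    "variable (8-10 of 12 agree)": 6,
    "poor or mixed (7 or fewer of 12 agree)": 5,
}


def band_percentages(bands, total):
    """Return each band's share of `total` as a whole-number percentage."""
    return {label: round(100 * count / total) for label, count in bands.items()}


percentages = band_percentages(agreement_bands, analyzed)

for label, pct in percentages.items():
    print(f"{label}: {agreement_bands[label]} of {analyzed} ({pct}%)")
```

The three bands account for all 33 analyzed cases, and the rounded shares match the 67%, 18%, and 15% stated in the abstract.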

Original language: English (US)
Pages (from-to): 456-464
Number of pages: 9
Journal: Child Abuse and Neglect
Issue number: 7
State: Published - Jul 2013

All Science Journal Classification (ASJC) codes

  • Pediatrics, Perinatology, and Child Health
  • Developmental and Educational Psychology
  • Psychiatry and Mental health

