Which Test Is Best: Evaluation of Traditional and Contemporary Statistical Tests for Analysis of Spherical Equivalent Prediction Error

Nathan T. Cannon, Giacomo Savini, Seth M. Pantanelli, Kenneth Hoffer, Petros Aristodemou, Kamran Riaz, David Murphy, David Griffin, Christian Berry, Guillaume Debellemanière, Mathieu Gauvin, Avi Wallerstein, Woong Joo Whang, Kyungmin Koh, Kazuno Negishi, Ken Hayashi, Diogo Hipólito-Fernandes, David L. Cooke

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Purpose: To characterize the performance of traditional and contemporary statistical tests for analysis of spherical equivalent prediction error (SEQ-PE) after cataract surgery, with regard to test significance and self-consistency.

Design: Comparison of the utility of statistical tests.

Methods: Subjects were eyes from 5 academic centers and 2 private practices that had cataract surgery and postoperative manifest refraction between March 2011 and December 2022. SEQ-PE data were randomly divided into subsets with sample sizes of 100, 300, 500, 700, and 2600 eyes. Mean absolute error (MAE), median absolute error (MedAE), SD, root mean squared absolute error (RMSAE), and the proportion of eyes within 0.50 diopters (D) of the predicted refraction were calculated for 6 power prediction formulas and analyzed using Friedman post hoc Dunn, Cochran Q post hoc McNemar, Eyetemis, and Wilcox-Holladay-Wang-Koch (WHWK) statistical tests. All tests were corrected for multiple comparisons using the Holm correction.

Main outcome measures: The percentage of significant relationships (Percent Significance), proportion of inconsistencies (Inconsistency Ratio), and proportion of self-consistent significant relationships (Significance Index) for each statistical test.

Results: Analysis was performed on 7839 eyes of 7839 patients. WHWK.MAE (42%), WHWK.SD (41%), Eyetemis.MAE (40%), WHWK.RMSAE (39%), and Dunn.MAE (34%) were, in that order, more robust than the remaining 3 tests by Percent Significance (all P < .001). Dunn.MAE had the best Inconsistency Ratio (0.11) in the 100-eye subsets. The same top 5 tests were most robust by Significance Index (0.39, 0.35, 0.35, 0.34, and 0.31, respectively; all P < .02). WHWK.SD and WHWK.RMSAE had the best Significance Indices (both 0.77) in the 2600-eye subsets. McNemar had the poorest Significance Index overall (0.09).

Conclusions: The 5 high-performing tests produced significant results more often and were also self-consistent. WHWK.MAE and McNemar were the highest and lowest performing overall, respectively. Dunn.MAE may be useful in sample sizes <150 eyes.
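
The abstract names its summary metrics and the Holm correction without giving formulas. The sketch below shows one conventional reading of them and is not the authors' code. It assumes that SD is the sample standard deviation of the signed prediction errors, that RMSAE is the square root of the mean squared error, and that "within 0.50 D" includes the boundary. The function names `summarize_prediction_errors` and `holm_adjust` and the example values are hypothetical. The Percent Significance, Inconsistency Ratio, and Significance Index are not computed here, because the abstract does not define them precisely enough to reproduce.

```python
import math
import statistics


def summarize_prediction_errors(errors, threshold=0.50):
    """Summary metrics for signed SEQ-PE values, in diopters.

    Assumptions (not stated in the abstract): SD is the sample SD of the
    signed errors, and RMSAE is sqrt(mean(error ** 2)).
    """
    abs_errors = [abs(e) for e in errors]
    squared = [e * e for e in errors]
    within = sum(1 for a in abs_errors if a <= threshold)
    return {
        "MAE": statistics.mean(abs_errors),
        "MedAE": statistics.median(abs_errors),
        "SD": statistics.stdev(errors),
        "RMSAE": math.sqrt(statistics.mean(squared)),
        "within_threshold": within / len(abs_errors),
    }


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, smallest p-value first.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    example_errors = [0.25, -0.50, 0.75, -0.25]
    print(summarize_prediction_errors(example_errors))
    print(holm_adjust([0.01, 0.04, 0.03]))
```

A pairwise comparison between two formulas would count as significant after correction when its Holm-adjusted p-value falls below the chosen alpha level.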

Original language: English (US)
Pages (from-to): 33-42
Number of pages: 10
Journal: American Journal of Ophthalmology
Volume: 273
DOIs
State: Published - May 2025

All Science Journal Classification (ASJC) codes

  • Ophthalmology
