A unified approach for outliers and influential data detection: The value of information in retrospect

Jacob Parsons, Le Bao

Research output: Contribution to journal › Article › peer-review

Abstract

Identifying influential and outlying data is important because it guides the effective collection of future data and the proper use of existing information. We develop a unified approach for outlier detection and influence analysis. Our proposed method is grounded in intuitive value of information concepts and has distinct advantages in interpretability and flexibility over existing methods: it decomposes the influence of data into a leverage effect (the influence that is expected) and an outlying effect (influence surprisingly greater than expected), and it applies to all decision problems, such as estimation, prediction and hypothesis testing. We study the theoretical properties of three value of information quantities, establish the relationship between the proposed measures and classic measures in the linear regression setting, and provide real data analysis examples of how to apply the new value of information approach to linear regression, generalized linear mixed models and hypothesis testing.
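For orientation, the sketch below computes two classic linear regression diagnostics: hat-matrix leverage and Cook's distance. It is not the value of information method proposed in the paper. The abstract does not name the classic measures it compares against, so the choice of these two is an assumption, as are the function name `classic_influence` and the made-up toy data.

```python
import numpy as np


def classic_influence(X, y):
    """Return hat-matrix leverages and Cook's distances for an OLS fit.

    Illustrative only: these are textbook diagnostics, assumed here as
    examples of "classic measures"; they are not the paper's measures.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Leverage is the diagonal of the hat matrix H = Q Q^T (thin QR of X).
    Q, _ = np.linalg.qr(X)
    leverage = np.sum(Q**2, axis=1)

    # Ordinary least squares fit and residuals.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)  # residual variance estimate

    # Cook's distance combines residual size with leverage.
    cooks = resid**2 / (p * s2) * leverage / (1.0 - leverage) ** 2
    return leverage, cooks


# Made-up toy data: a straight line with noise, plus one point that is
# both far out in x (high leverage) and off the line (outlying).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=20)
x[-1] = 3.0
y[-1] = 1.0 + 2.0 * 3.0 + 2.0

X = np.column_stack([np.ones_like(x), x])
leverage, cooks = classic_influence(X, y)
print("highest leverage at index", int(np.argmax(leverage)))
print("largest Cook's distance at index", int(np.argmax(cooks)))
```

In this toy setting the last observation has a large leverage because of its x position alone, while Cook's distance also reflects how far its response lies from the fitted line. This loosely mirrors the split between an expected leverage effect and a surprising outlying effect described in the abstract.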

Original language: English (US)
Article number: e442
Journal: Stat
Volume: 11
Issue number: 1
State: Published - Dec 2022

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
