TY - JOUR
T1 - Understanding and predicting retractions of published work
AU - Modukuri, Sai Ajay
AU - Rajtmajer, Sarah
AU - Squicciarini, Anna Cinzia
AU - Wu, Jian
AU - Giles, C. Lee
N1 - Publisher Copyright:
Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2021
Y1 - 2021
N2 - Recent increases in the number of retractions of published papers reflect heightened attention and increased scrutiny in the scientific process motivated, in part, by the replication crisis. These trends motivate computational tools for understanding and assessment of the scholarly record. Here, we sketch the landscape of retracted papers in the Retraction Watch database, a collection of 19k records of published scholarly articles that have been retracted for various reasons (e.g., plagiarism, data error). Using metadata as well as features derived from full-text for a subset of retracted papers in the social and behavioral sciences, we develop a random forest classifier to predict retraction in new samples with 73% accuracy and F1-score of 71%. We believe this study to be the first of its kind to demonstrate the utility of machine learning as a tool for the assessment of retracted work.
AB - Recent increases in the number of retractions of published papers reflect heightened attention and increased scrutiny in the scientific process motivated, in part, by the replication crisis. These trends motivate computational tools for understanding and assessment of the scholarly record. Here, we sketch the landscape of retracted papers in the Retraction Watch database, a collection of 19k records of published scholarly articles that have been retracted for various reasons (e.g., plagiarism, data error). Using metadata as well as features derived from full-text for a subset of retracted papers in the social and behavioral sciences, we develop a random forest classifier to predict retraction in new samples with 73% accuracy and F1-score of 71%. We believe this study to be the first of its kind to demonstrate the utility of machine learning as a tool for the assessment of retracted work.
UR - http://www.scopus.com/inward/record.url?scp=85103079212&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103079212&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85103079212
SN - 1613-0073
VL - 2831
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 2021 Workshop on Scientific Document Understanding, SDU 2021
Y2 - 9 February 2021
ER -