TY - GEN
T1 - ClinicalRisk
T2 - 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023
AU - Luo, Junyu
AU - Qiao, Zhi
AU - Glass, Lucas
AU - Xiao, Cao
AU - Ma, Fenglong
N1 - Publisher Copyright:
© 2023 Copyright held by the owner/author(s).
PY - 2023/10/21
Y1 - 2023/10/21
AB - Clinical trials study new tests and evaluate their effects on human health outcomes, and they represent a huge market. However, carrying out clinical trials is expensive and time-consuming, and trials often end without results. An effective model that automatically estimates the status of a clinical trial and identifies possible failure reasons would revolutionize clinical practice. Developing such a model is challenging because no benchmark dataset exists. To address this challenge, we first build a new dataset by extracting the publicly available clinical trial reports from ClinicalTrials.gov. The status associated with each report is treated as the status label. To analyze the failure reasons, domain experts manually annotate each failed report based on the description associated with it. More importantly, we examine several state-of-the-art text classification baselines on this task and find that the unique format of clinical trial protocols plays an essential role in prediction accuracy, demonstrating the need for specially designed clinical trial classification models.
UR - http://www.scopus.com/inward/record.url?scp=85178124114&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85178124114&partnerID=8YFLogxK
U2 - 10.1145/3583780.3615113
DO - 10.1145/3583780.3615113
M3 - Conference contribution
C2 - 38601744
AN - SCOPUS:85178124114
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 5356
EP - 5360
BT - CIKM 2023 - Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 21 October 2023 through 25 October 2023
ER -