TY - GEN
T1 - CHECKER
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2021
AU - Xie, Tianyi
AU - Le, Thai
AU - Lee, Dongwon
N1 - Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Clickbait thumbnails on video-sharing platforms (e.g., YouTube, Dailymotion) are small catchy images that are designed to entice users to click to view the linked videos. Despite their usefulness, the landing videos after click are often inconsistent with what the thumbnails have advertised, causing poor user experience and undermining the reputation of the platforms. In this work, therefore, we aim to develop a computational solution, named as CHECKER, to detect clickbait thumbnails with high accuracy. Due to the fuzziness in the definition of clickbait thumbnails and subsequent challenges in creating high-quality labeled samples, the industry has not coped with clickbait thumbnails adequately. To address this challenge, CHECKER shares a novel clickbait thumbnail dataset and codebase with the industry, and exploits: (1) the weak supervision framework to generate many noisy-but-useful labels, and (2) the co-teaching framework to learn robustly using such noisy labels. Moreover, we also investigate how to detect clickbaits on video-sharing platforms with both thumbnails and titles, and exploit recent advances in vision-language models. In the empirical validation, CHECKER outperforms five baselines by at least 6.4% in F1-score and 4.2% in AUC-ROC. The codebase and dataset from our paper are available at: https://github.com/XPandora/CHECKER.
AB - Clickbait thumbnails on video-sharing platforms (e.g., YouTube, Dailymotion) are small catchy images that are designed to entice users to click to view the linked videos. Despite their usefulness, the landing videos after click are often inconsistent with what the thumbnails have advertised, causing poor user experience and undermining the reputation of the platforms. In this work, therefore, we aim to develop a computational solution, named as CHECKER, to detect clickbait thumbnails with high accuracy. Due to the fuzziness in the definition of clickbait thumbnails and subsequent challenges in creating high-quality labeled samples, the industry has not coped with clickbait thumbnails adequately. To address this challenge, CHECKER shares a novel clickbait thumbnail dataset and codebase with the industry, and exploits: (1) the weak supervision framework to generate many noisy-but-useful labels, and (2) the co-teaching framework to learn robustly using such noisy labels. Moreover, we also investigate how to detect clickbaits on video-sharing platforms with both thumbnails and titles, and exploit recent advances in vision-language models. In the empirical validation, CHECKER outperforms five baselines by at least 6.4% in F1-score and 4.2% in AUC-ROC. The codebase and dataset from our paper are available at: https://github.com/XPandora/CHECKER.
UR - http://www.scopus.com/inward/record.url?scp=85115711160&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115711160&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-86517-7_26
DO - 10.1007/978-3-030-86517-7_26
M3 - Conference contribution
AN - SCOPUS:85115711160
SN - 9783030865160
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 415
EP - 430
BT - Machine Learning and Knowledge Discovery in Databases
A2 - Dong, Yuxiao
A2 - Kourtellis, Nicolas
A2 - Hammer, Barbara
A2 - Lozano, Jose A.
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 13 September 2021 through 17 September 2021
ER -