Creation and Analysis of a Corpus of Scam Emails Targeting Universities

Grace Ciambrone, Shomir Wilson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Email-based scams pose a threat to the personally identifiable information and financial safety of all email users. Within a university environment, the risks are potentially greater: traditional students (i.e., within an age range typical of college students) often lack the experience and knowledge of older email users. By understanding the topics, temporal trends, and other patterns of scam emails targeting universities, these institutions can be better equipped to reduce this threat by improving their filtering methods and educating their users. While anecdotal evidence suggests common topics and trends in these scams, the empirical evidence is limited. Observing that large universities are uniquely positioned to gather and share information about email scams, we built a corpus of 5,155 English language scam emails scraped from information security websites of five large universities in the United States. We use Latent Dirichlet Allocation (LDA) topic modelling to assess the landscape and trends of scam emails sent to university addresses. We examine themes chronologically and observe that topics vary over time, indicating changes in scammer strategies. For example, scams targeting students with disabilities have steadily risen in popularity since they first appeared in 2015, while password scams experienced a boom in 2016 but have lessened in recent years. To encourage further research to mitigate the threat of email scams, we release this corpus for others to study.

Original language: English (US)
Title of host publication: ACM Web Conference 2023 - Companion of the World Wide Web Conference, WWW 2023
Publisher: Association for Computing Machinery, Inc
Pages: 24-27
Number of pages: 4
ISBN (Electronic): 9781450394161
State: Published - Apr 30 2023
Event: 2023 World Wide Web Conference, WWW 2023 - Austin, United States
Duration: Apr 30 2023 - May 4 2023

Publication series

Name: ACM Web Conference 2023 - Companion of the World Wide Web Conference, WWW 2023

Conference

Conference: 2023 World Wide Web Conference, WWW 2023
Country/Territory: United States
City: Austin
Period: 4/30/23 - 5/4/23

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Software
