Scholarly digital libraries as a platform for malware distribution

Nir Nissim, Aviad Cohen, Jian Wu, Andrea Lanzi, Lior Rokach, Yuval Elovici, Lee Giles

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Scopus citations


Researchers from academic institutions and the corporate sector rely heavily on scholarly digital libraries for accessing journal articles and conference proceedings. Because these documents are primarily downloaded as PDF files, there is a risk that they may be compromised by attackers. PDF files have many capabilities that have been widely used for malicious operations. Attackers increasingly take advantage of innocent users who open PDF files with little or no concern, mistakenly considering these files safe and relatively non-threatening. Researchers also consider scholarly digital libraries reliable, home to a trusted corpus of papers, and untainted by malicious files. For these reasons, scholarly digital libraries are an attractive target for cyber-attacks launched via PDF files. In this study, we present several vulnerabilities and practical distribution attack approaches tailored for scholarly digital libraries. To support our claim regarding the attractiveness of scholarly digital libraries as an attack platform, we evaluated more than two million scholarly papers in the CiteSeerX library that were collected over 8 years and found the library to be contaminated with a surprisingly large number (0.3%-2%) of malicious scholarly PDF documents, which originated from 46 different countries worldwide. More than 55% of the malicious papers in CiteSeerX were crawled from IP addresses belonging to USA universities, followed by those belonging to Europe (33.6%). We show how existing scholarly digital libraries can be easily leveraged as a distribution platform, both for a targeted attack and for worldwide distribution. On average, a malicious paper caused high-impact damage, as it was downloaded 167 times in 5 years by researchers from different countries worldwide. In general, users in the USA and Asia downloaded the most malicious scholarly papers, 40.15% and 27.9%, respectively. The top malicious scholarly document downloaded is a malicious version of a popular paper in the computer forensics domain, with 2213 downloads across 108 different countries. Finally, we suggest several concrete solutions for mitigating such attacks, including simple deterministic solutions and advanced machine learning-based frameworks.

Original language: English (US)
Title of host publication: A Systems Approach to Cyber Security - Proceedings of the 2nd Singapore Cyber-Security R and D Conference, SG-CRC 2017
Editors: Yang Liu, Abhik Roychoudhury
Publisher: IOS Press
Number of pages: 22
ISBN (Electronic): 9781614997436
State: Published - 2017
Event: 2nd Singapore Cyber-Security R and D Conference, SG-CRC 2017 - Singapore, Singapore
Duration: Feb 21 2017 - Feb 22 2017

Publication series

Name: Cryptology and Information Security Series
ISSN (Print): 1871-6431
ISSN (Electronic): 1879-8101


Other: 2nd Singapore Cyber-Security R and D Conference, SG-CRC 2017

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Electrical and Electronic Engineering

