Bohemia - A Validator for Parser Frameworks

Anish Paranjpe, Gang Tan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Parsing is ubiquitous in software projects, ranging from small command-line utilities and highly secure network clients to large compilers. Programmers have a plethora of parsing libraries to choose from. However, implementation bugs in parsing libraries allow the generation of incorrect parsers, which in turn may allow malicious inputs to crash systems or launch security exploits. In this paper we describe Bohemia, a lightweight validation framework that a parsing library developer can use as one tool in a toolkit for integration testing. The framework uses the concept of Equivalence Modulo Inputs (EMI) to generate mutated input grammars that stress test the parsing library. We also describe the results of evaluating Bohemia on a set of parsing libraries that use distinct parsing algorithms. During the evaluation, we found a number of bugs in those libraries, some of which have been reported to and fixed by the developers.
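
The abstract does not spell out how the mutated grammars are produced, so the following is only a minimal sketch of the general EMI idea applied to grammars, not Bohemia's actual algorithm. It assumes that a mutation may touch only grammar alternatives that a fixed set of test inputs never exercises, so that every variant must still accept those inputs. The toy grammar, the backtracking recognizer and the names `derive`, `covered` and `emi_variants` are all hypothetical and stand in for a real parsing library.

```python
# Toy grammar: nonterminal -> list of alternatives (tuples of symbols).
# Any symbol that is not a key of the dict is treated as a literal terminal.
GRAMMAR = {
    "expr": [("term", "+", "expr"), ("term",)],
    "term": [("num",), ("(", "expr", ")")],
    "num": [("0",), ("1",)],
}
INPUTS = ["1+0", "1"]  # the inputs the variants must stay equivalent on


def derive(grammar, symbol, text, pos):
    """Yield (end, used) for each way `symbol` matches `text` from `pos`.

    `used` is the set of (nonterminal, alternative index) pairs in that
    derivation. Naive backtracking: the grammar must not be left-recursive.
    """
    if symbol not in grammar:  # terminal: match the literal string
        if text.startswith(symbol, pos):
            yield pos + len(symbol), frozenset()
        return
    for i, alt in enumerate(grammar[symbol]):
        states = [(pos, frozenset({(symbol, i)}))]
        for part in alt:
            states = [
                (end, used | more)
                for p, used in states
                for end, more in derive(grammar, part, text, p)
            ]
        yield from states


def covered(grammar, start, text):
    """Return the alternatives used by the first full parse, or None."""
    for end, used in derive(grammar, start, text, 0):
        if end == len(text):
            return used
    return None


def emi_variants(grammar, used):
    """Build one variant per unexercised alternative by deleting it."""
    variants = []
    for nt, alts in grammar.items():
        for i in range(len(alts)):
            # Keep at least one alternative per nonterminal.
            if (nt, i) not in used and len(alts) > 1:
                mutated = dict(grammar)
                mutated[nt] = alts[:i] + alts[i + 1:]
                variants.append(mutated)
    return variants


# Alternatives exercised by any of the test inputs.
used = set().union(*(covered(GRAMMAR, "expr", s) for s in INPUTS))
variants = emi_variants(GRAMMAR, used)

# Every variant is equivalent to the original modulo INPUTS, so a parser
# built from a variant that rejects one of them would indicate a bug.
for variant in variants:
    for s in INPUTS:
        assert covered(variant, "expr", s) is not None
```

In this sketch the recognizer plays both roles: it measures coverage and it stands in for the parser under test. In a real harness the second role would be taken by the parser that the library generates from each variant, and any disagreement on the test inputs would be reported as a suspected library bug.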

Original language: English (US)
Title of host publication: Proceedings - 2021 IEEE Symposium on Security and Privacy Workshops, SPW 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 162-170
Number of pages: 9
ISBN (Electronic): 9781728189345
State: Published - May 2021
Event: 2021 IEEE Symposium on Security and Privacy Workshops, SPW 2021 - Virtual, Online
Duration: May 27 2021 → …

Publication series

Name: Proceedings - 2021 IEEE Symposium on Security and Privacy Workshops, SPW 2021

Conference

Conference: 2021 IEEE Symposium on Security and Privacy Workshops, SPW 2021
City: Virtual, Online
Period: 5/27/21 → …

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
