DISV: Domain Independent Semantic Validation of Data Files

Ashish Kumar, Bill Harris, Gang Tan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Data format specification languages such as PDF or HTML have been used extensively for exchanging structured data over the internet. While receivers of data files (e.g., PDF viewers or web browsers) perform syntax validation of files, validating deep semantic properties has not been systematically explored in practice. However, data files that violate semantic properties may cause unintended effects on receivers, such as causing them to crash or creating security breaches, as recent attacks showed. We present our tool DISV (Domain Independent Semantic Validator). It includes a declarative specification language for users to specify semantic properties of a data format. It also includes a validator that takes a data file together with a property specification and checks if the file follows the specification. We demonstrate a rich variety of properties that can be verified by our tool using eight case studies over three data formats. We also demonstrate that our tool can be used to detect advanced attacks on PDF documents.

Original language: English (US)
Title of host publication: Proceeding - 44th IEEE Symposium on Security and Privacy Workshops, SPW 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 163-174
Number of pages: 12
ISBN (Electronic): 9798350312362
State: Published - 2023
Event: 44th IEEE Symposium on Security and Privacy Workshops, SPW 2023 - San Francisco, United States
Duration: May 22 2023 to May 25 2023

Publication series

Name: Proceeding - 44th IEEE Symposium on Security and Privacy Workshops, SPW 2023

Conference

Conference: 44th IEEE Symposium on Security and Privacy Workshops, SPW 2023
Country/Territory: United States
City: San Francisco
Period: 5/22/23 to 5/25/23

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
  • Safety, Risk, Reliability and Quality
