Sprite: A fast parallel SNP detection pipeline

Vasudevan Rengasamy, Kamesh Madduri

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations


We present Sprite, a new high-performance data analysis pipeline for detecting single nucleotide polymorphisms (SNPs) in the human genome. A SNP detection pipeline for next-generation sequencing data uses several software tools, including tools for read alignment, processing alignment output, and SNP identification. We target end-to-end scalability and I/O efficiency in Sprite by merging tools in this pipeline and eliminating redundancies. For a benchmark human whole-genome sequencing data set, Sprite takes less than 50 min on 16 nodes of the TACC Stampede supercomputer. A key component of our optimized pipeline is parsnip, a new parallel method and software tool for SNP detection. We find that the quality of results obtained by parsnip (sensitivity and precision using high-confidence variant calls as ground truth) is comparable to state-of-the-art SNP-calling software. A prototype implementation of Sprite is available at sprite-psu.sourceforge.net.
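The two quality measures named in the abstract, sensitivity and precision against a ground-truth call set, can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes that a called SNP counts as a true positive only when its (chromosome, position, reference allele, alternate allele) tuple matches a ground-truth entry exactly. The function name `sensitivity_precision` and the toy variants are hypothetical and chosen only for illustration.

```python
def sensitivity_precision(called, truth):
    """Return (sensitivity, precision) for a set of called SNPs.

    Each variant is a hashable tuple such as (chrom, pos, ref, alt).
    Exact-tuple matching is an assumption of this sketch.
    """
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and present in ground truth
    fp = len(called - truth)   # called but absent from ground truth
    fn = len(truth - called)   # in ground truth but not called
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


# Toy data, made up for illustration only.
truth = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 75, "G", "A"), ("chr2", 900, "T", "C")]
called = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
          ("chr2", 75, "G", "A"), ("chr2", 500, "A", "C"),
          ("chr3", 10, "G", "T")]

s, p = sensitivity_precision(called, truth)
print(f"sensitivity={s:.2f} precision={p:.2f}")
```

Here three of the four ground-truth variants are recovered and three of the five calls are correct, so sensitivity is 3/4 and precision is 3/5.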

Original language: English (US)
Title of host publication: High Performance Computing - 31st International Conference, ISC High Performance 2016, Proceedings
Editors: Jack Dongarra, Julian M. Kunkel, Pavan Balaji
Publisher: Springer Verlag
Number of pages: 19
ISBN (Print): 9783319413204
State: Published - 2016
Event: 31st International Conference on High Performance Computing, ISC High Performance 2016 - Frankfurt, Germany
Duration: Jun 19 2016 to Jun 23 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: 31st International Conference on High Performance Computing, ISC High Performance 2016

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

