TY - GEN
T1 - Sprite
T2 - 31st International Conference on High Performance Computing, ISC High Performance 2016
AU - Rengasamy, Vasudevan
AU - Madduri, Kamesh
N1 - Funding Information:
This research is supported by the National Science Foundation award #1439057. We thank members of our project research team for helpful discussions.
Publisher Copyright:
© Springer International Publishing Switzerland 2016.
PY - 2016
Y1 - 2016
N2 - We present Sprite, a new high-performance data analysis pipeline for detecting single nucleotide polymorphisms (SNPs) in the human genome. A SNP detection pipeline for next-generation sequencing data uses several software tools, including tools for read alignment, processing alignment output, and SNP identification. We target end-to-end scalability and I/O efficiency in Sprite by merging tools in this pipeline and eliminating redundancies. For a benchmark human whole-genome sequencing data set, Sprite takes less than 50 min on 16 nodes of the TACC Stampede supercomputer. A key component of our optimized pipeline is parsnip, a new parallel method and software tool for SNP detection. We find that the quality of results obtained by parsnip (sensitivity and precision using high-confidence variant calls as ground truth) is comparable to state-of-the-art SNP-calling software. A prototype implementation of Sprite is available at sprite-psu.sourceforge.net.
AB - We present Sprite, a new high-performance data analysis pipeline for detecting single nucleotide polymorphisms (SNPs) in the human genome. A SNP detection pipeline for next-generation sequencing data uses several software tools, including tools for read alignment, processing alignment output, and SNP identification. We target end-to-end scalability and I/O efficiency in Sprite by merging tools in this pipeline and eliminating redundancies. For a benchmark human whole-genome sequencing data set, Sprite takes less than 50 min on 16 nodes of the TACC Stampede supercomputer. A key component of our optimized pipeline is parsnip, a new parallel method and software tool for SNP detection. We find that the quality of results obtained by parsnip (sensitivity and precision using high-confidence variant calls as ground truth) is comparable to state-of-the-art SNP-calling software. A prototype implementation of Sprite is available at sprite-psu.sourceforge.net.
UR - http://www.scopus.com/inward/record.url?scp=84977477693&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84977477693&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-41321-1_9
DO - 10.1007/978-3-319-41321-1_9
M3 - Conference contribution
AN - SCOPUS:84977477693
SN - 9783319413204
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 159
EP - 177
BT - High Performance Computing - 31st International Conference, ISC High Performance 2016, Proceedings
A2 - Dongarra, Jack
A2 - Kunkel, Julian M.
A2 - Balaji, Pavan
PB - Springer Verlag
Y2 - 19 June 2016 through 23 June 2016
ER -