Assembling genomes and mini-metagenomes from highly chimeric reads

Sergey Nurk, Anton Bankevich, Dmitry Antipov, Alexey Gurevich, Anton Korobeynikov, Alla Lapidus, Andrey Prjibelsky, Alexey Pyshkin, Alexander Sirotkin, Yakov Sirotkin, Ramunas Stepanauskas, Jeffrey McLean, Roger Lasken, Scott R. Clingenpeel, Tanja Woyke, Glenn Tesler, Max A. Alekseyev, Pavel A. Pevzner

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

526 Scopus citations

Abstract

Recent advances in single-cell genomics provide an alternative to gene-centric metagenomics studies, enabling whole genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging due to (i) the highly non-uniform read coverage, and (ii) a greatly elevated number of chimeric reads and read pairs. While recently developed single-cell assemblers have addressed the former challenge, methods for assembling highly chimeric reads remain poorly explored. We present algorithms for identifying chimeric edges and resolving complex bulges in de Bruijn graphs, which significantly improve single-cell assemblies. We further describe applications of the single-cell assembler SPAdes to a new approach for capturing and sequencing "dark matter of life" that forms small pools of randomly selected single cells (called a mini-metagenome) and further sequences all genomes from the mini-metagenome at once. We demonstrate that SPAdes enables sequencing mini-metagenomes and benchmark it against various assemblers. On single-cell bacterial datasets, SPAdes improves on the recently developed E+V-SC and IDBA-UD assemblers specifically designed for single-cell sequencing. For standard (multicell) datasets, SPAdes also improves on A5, ABySS, CLC, EULER-SR, Ray, SOAPdenovo, and Velvet.

Original language: English (US)
Title of host publication: Research in Computational Molecular Biology - 17th Annual International Conference, RECOMB 2013, Proceedings
Pages: 158-170
Number of pages: 13
State: Published - 2013
Event: 17th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2013 - Beijing, China
Duration: Apr 7 2013 to Apr 10 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7821 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2013
Country/Territory: China
City: Beijing
Period: 4/7/13 to 4/10/13

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
