TY - GEN
T1 - Long Reads Enable Accurate Estimates of Complexity of Metagenomes
AU - Bankevich, Anton
AU - Pevzner, Pavel
N1 - Publisher Copyright: © Springer International Publishing AG, part of Springer Nature 2018.
PY - 2018
Y1 - 2018
AB - Although reduced microbiome diversity has been linked to various diseases, estimating the diversity of bacterial communities (the number and the total length of distinct genomes within a metagenome) remains an open problem in microbial ecology. We describe the first analysis of microbial diversity using long reads without any assumption on the frequencies of genomes within a metagenome (parametric methods) and without requiring a large database that covers the total diversity (non-parametric methods). Long-read technologies provide new insights into the diversity of metagenomes by interrogating rare species that remained below the radar of previous approaches based on short reads. We present a novel approach for estimating the diversity of metagenomes based on joint analysis of short and long reads and benchmark it on various datasets. We estimate that genomes comprising a human gut metagenome have total length varying from 1.3 to 3.5 billion nucleotides, with genomes responsible for 50% of total abundance having total length varying from only 40 to 60 million nucleotides. In contrast, genomes comprising an aquifer sediment metagenome have more than two orders of magnitude larger total length (≈ 840 billion nucleotides).
UR - http://www.scopus.com/inward/record.url?scp=85046138122&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046138122&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-89929-9_1
DO - 10.1007/978-3-319-89929-9_1
M3 - Conference contribution
AN - SCOPUS:85046138122
SN - 9783319899282
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 20
BT - Research in Computational Molecular Biology - 22nd Annual International Conference, RECOMB 2018, Proceedings
A2 - Raphael, Benjamin J.
PB - Springer Verlag
T2 - 22nd International Conference on Research in Computational Molecular Biology, RECOMB 2018
Y2 - 21 April 2018 through 24 April 2018
ER -