Error correction of high-throughput sequencing datasets with non-uniform coverage

Paul Medvedev, Eric Scott, Boyko Kakaradov, Pavel Pevzner

Research output: Contribution to journal › Article › peer-review

91 Scopus citations

Abstract

Motivation: The continuing improvements to high-throughput sequencing (HTS) platforms have begun to unfold a myriad of new applications. As a result, error correction of sequencing reads remains an important problem. Though several tools do an excellent job of correcting datasets where the reads are sampled close to uniformly, the problem of correcting reads coming from drastically non-uniform datasets, such as those from single-cell sequencing, remains open.

Results: In this article, we develop the method Hammer for error correction without any uniformity assumptions. Hammer is based on a combination of a Hamming graph and a simple probabilistic model for sequencing errors. It is a simple and adaptable algorithm that improves on other tools on non-uniform single-cell data, while achieving comparable results on normal multi-cell data.
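The abstract's core idea is to cluster k-mers in a Hamming graph (k-mers connected when their Hamming distance is small) and correct erroneous k-mers toward a cluster "center" chosen via a probabilistic model. The sketch below is only a rough illustration of that idea, not the authors' implementation: it uses k-mer multiplicity as a crude stand-in for the paper's probabilistic weighting, builds the Hamming graph by brute-force pairwise comparison, and all names and parameters (correct_reads, cluster_kmers, k, tau, the toy reads) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Hamming distance between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def kmers(read, k):
    """All k-mers of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def cluster_kmers(counts, tau):
    """Connected components of the Hamming graph: k-mers are joined
    when their Hamming distance is <= tau (union-find, quadratic and
    suitable for a toy example only)."""
    parent = {km: km for km in counts}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(counts, 2):
        if hamming(a, b) <= tau:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb

    components = {}
    for km in counts:
        components.setdefault(find(km), []).append(km)
    return list(components.values())


def correct_reads(reads, k=15, tau=1):
    """Map every k-mer to the most abundant k-mer of its cluster
    (a proxy for the 'center'), then rebuild reads from the
    corrected, overlapping k-mers."""
    counts = Counter(km for r in reads for km in kmers(r, k))
    center = {}
    for comp in cluster_kmers(counts, tau):
        best = max(comp, key=lambda km: counts[km])
        for km in comp:
            center[km] = best

    corrected = []
    for r in reads:
        fixed = [center[km] for km in kmers(r, k)]
        # Stitch overlapping corrected k-mers back into a read.
        corrected.append(fixed[0] + "".join(km[-1] for km in fixed[1:]))
    return corrected


if __name__ == "__main__":
    # Toy data: the middle read carries one substitution error.
    reads = [
        "ACGTACGTACGTACGTA",
        "ACGTACGTACGAACGTA",
        "ACGTACGTACGTACGTA",
    ]
    print(correct_reads(reads, k=15, tau=1))
```

In this toy run the erroneous k-mers from the middle read each fall into a cluster whose most abundant member is the correct k-mer, so the rebuilt read matches the other two; the paper's actual method replaces the simple abundance vote with a probabilistic error model so that corrections remain reliable even when coverage is highly non-uniform.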

Original language: English (US)
Article number: btr208
Pages (from-to): i137-i141
Journal: Bioinformatics
Volume: 27
Issue number: 13
DOIs
State: Published - Jul 2011

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

