Transforming error-prone immunosequencing datasets into Ab repertoires is a fundamental problem in immunogenomics, and a prerequisite for studies of immune responses. Although various repertoire reconstruction algorithms were released in the last 3 y, it remains unclear how to benchmark them and how to assess the accuracy of the reconstructed repertoires. We describe an accurate IgReC algorithm for constructing Ab repertoires from high-throughput immunosequencing datasets and a new framework for assessing the quality of reconstructed repertoires. Surprisingly, Ab repertoires constructed by IgReC from barcoded immunosequencing datasets in the blind mode (without using information about unique molecular identifiers) improved upon the repertoires constructed by the state-of-the-art tools that use barcoding. This finding suggests that IgReC may alleviate the need to generate repertoires using the barcoding technology (the workhorse of current immunogenomics efforts) because our computational approach to error correction of immunosequencing data is nearly as powerful as the experimental approach based on barcoding.
All Science Journal Classification (ASJC) codes
- Immunology and Allergy