Transcriptome assembly at single-cell resolution with Beaver

Qian Shi, Qimin Zhang, Mingfu Shao

Research output: Contribution to journalArticlepeer-review

Abstract

Motivation The established single-cell RNA sequencing (scRNA-seq) technologies has revolutionized biological and biomedical research by enabling the measurement of gene expression at single-cell resolution. However, the fundamental challenge of reconstructing full-length transcripts for individual cells remains unresolved. Existing single-sample assembly approaches cannot leverage shared information across cells while meta-assembly approaches often fail to strike a balance between consensus assembly and preserving cell-specific expression signatures. Results We present Beaver, a cell-specific transcript assembler designed for short-read scRNA-seq data. Beaver implements a transcript fragment graph to organize individual assemblies and designs an efficient dynamic programming algorithm that searches for candidate full-length transcripts from the graph. Beaver incorporates two random forest models trained on 51 meticulously engineered features that accurately estimate the likelihood of each candidate transcript being expressed in individual cells. Our experiments, performed using both real and simulated Smart-seq3 scRNA-seq data, firmly show that Beaver substantially outperforms existing meta-assemblers and single-sample assemblers. At the same level of sensitivity, Beaver achieved 32.0%-64.6%, 13.5%-36.6%, and 9.8%-36.3% higher precision in average compared to meta-assemblers Aletsch, TransMeta, and PsiCLASS, respectively, with similar improvements over single-sample assemblers Scallop2 (10.1%-43.6%) and StringTie2 (24.3%-67.0%). Availability and implementation Beaver is freely available at https://github.com/Shao-Group/beaver. Scripts that reproduce the experimental results of this manuscript are available at https://github.com/Shao-Group/beaver-test.

Original languageEnglish (US)
Pages (from-to)i323-i331
JournalBioinformatics
Volume41
Issue numberSupplement_1
DOIs
StatePublished - Jul 1 2025

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Fingerprint

Dive into the research topics of 'Transcriptome assembly at single-cell resolution with Beaver'. Together they form a unique fingerprint.

Cite this