Explicit Scale Simulation for analysis of RNA-sequencing count data with ALDEx2

Research output: Contribution to journal › Article › peer-review

Abstract

In high-throughput sequencing (HTS) studies, sample-to-sample variation in sequencing depth is driven by technical factors, not by variation in the scale (size) of the biological system. Typically, a statistical normalization removes unwanted technical variation in the data or the parameters of the model to enable differential abundance analyses. We recently showed that all normalizations make implicit assumptions about the unmeasured system scale and that errors in these assumptions can dramatically increase false positive and false negative rates. We demonstrated that these errors can be mitigated by accounting for uncertainty using a scale model, which we integrated into the ALDEx2 R package. This article provides new insights focusing on the application to transcriptomic analysis. We provide transcriptomic case studies demonstrating how scale models, rather than traditional normalizations, can reduce false positive and false negative rates in practice while enhancing the transparency and reproducibility of analyses. These scale models replace the need for dual cutoff approaches often used to address the disconnect between practical and statistical significance. We demonstrate the utility of scale models built from known housekeeping genes in complex metatranscriptomic datasets. Thus, this work provides guidance on how to incorporate scale into transcriptomic data sets.

Original language: English (US)
Article number: lqaf108
Journal: NAR Genomics and Bioinformatics
Volume: 7
Issue number: 3
State: Published - Sep 1, 2025

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Molecular Biology
  • Genetics
  • Computer Science Applications
  • Applied Mathematics
