A bi-Poisson model for clustering gene expression profiles by RNA-seq

Ningtao Wang, Yaqun Wang, Han Hao, Luojun Wang, Zhong Wang, Jianxin Wang, Rongling Wu

Research output: Contribution to journal › Article › peer-review

10 Scopus citations


With the availability of gene expression data by RNA-seq, powerful statistical approaches for grouping similar gene expression profiles across different environments have become increasingly important. We describe and assess a computational model for clustering genes into distinct groups based on the pattern of gene expression in response to changing environment. The model capitalizes on the Poisson distribution to capture the count property of RNA-seq data. A two-stage hierarchical expectation-maximization (EM) algorithm is implemented to estimate an optimal number of groups and mean expression amounts of each group across two environments. A procedure is formulated to test whether and how a given group shows a plastic response to environmental changes. The impact of gene-environment interactions on the phenotypic plasticity of the organism can also be visualized and characterized. The model was used to analyse an RNA-seq dataset measured from two cell lines of breast cancer that respond differently to an anti-cancer drug, from which genes associated with the resistance and sensitivity of the cell lines are identified. We performed simulation studies to validate the statistical behaviour of the model. The model provides a useful tool for clustering gene expression data by RNA-seq, facilitating our understanding of gene functions and networks.

Original language: English (US)
Article number: bbt029
Pages (from-to): 534-541
Number of pages: 8
Journal: Briefings in Bioinformatics
Issue number: 4
State: Published - Jul 2014

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Molecular Biology


