Composite score analysis for unsupervised comparison and network visualization of metabolomics data

Joshua J. Kellogg, Olav M. Kvalheim, Nadja B. Cech

Research output: Contribution to journal, Article, peer-review

16 Scopus citations


Metabolomics-based approaches are becoming increasingly popular for interrogating the chemical basis of phenotypic differences in biological systems. Successful metabolomics studies employ multivariate data analysis to compare large and highly complex datasets. A primary tool for unsupervised statistical analysis, principal component analysis (PCA), relies on selecting a subset of at most three components from a larger model to represent similarity visually. Using only three principal components limits the comprehensiveness of the model and can mask discrimination between samples. We have developed a new statistical metric, the composite score (CS), a univariate statistic that incorporates multiple principal components to calculate a correlation matrix. This matrix enables quantitative comparison of similarity between samples within one dataset, based upon measured metabolome profiles. Composite score values were tabulated using profiles of complex extracts of dietary supplements from the plant Hydrastis canadensis (goldenseal) as a case study. Several outliers were unambiguously identified, and a PCA composite score network was developed to provide a graphical representation of the composite score matrix. Comparison with visualization using PCA score plots or dendrograms from hierarchical clustering analysis (HCA) demonstrates the utility of the composite score as a tool for metabolomics studies that seek to quantify similarity among samples. An R script for the calculation of the composite score has been made available.
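The general idea of a sample-by-sample correlation matrix built from more than three principal components can be illustrated with a short sketch. The abstract does not state the composite score formula, so the code below is not the authors' method or their R script. It assumes, purely for illustration, that each sample is described by its PCA scores on the first few components and that similarity is the Pearson correlation between those score vectors. The function name `composite_score_sketch` and the random input data are hypothetical.

```python
import numpy as np


def composite_score_sketch(X, n_components):
    """Illustrative sample-by-sample similarity matrix from PCA scores.

    ASSUMPTION: this is not the published composite score formula.
    It correlates the score vectors of each pair of samples across
    the first `n_components` principal components.
    """
    X = np.asarray(X, dtype=float)
    # Mean-center each feature (column) before PCA.
    Xc = X - X.mean(axis=0)
    # PCA via singular value decomposition: scores = U * s.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    # np.corrcoef treats each row as a variable, so the result is a
    # (n_samples x n_samples) Pearson correlation matrix.
    return np.corrcoef(scores)


# Hypothetical data: 6 samples, 20 measured features.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 20))
cs_demo = composite_score_sketch(X_demo, n_components=4)
```

Each entry of the resulting matrix is a single number per sample pair, which is what makes thresholding into a network or tabulating outliers straightforward. The number of components retained is a free choice in this sketch.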

Original language: English (US)
Pages (from-to): 38-47
Number of pages: 10
Journal: Analytica Chimica Acta
State: Published - Jan 25 2020

All Science Journal Classification (ASJC) codes

  • Analytical Chemistry
  • Biochemistry
  • Environmental Chemistry
  • Spectroscopy

