ROBUST SHAPE MATRIX ESTIMATION FOR HIGH-DIMENSIONAL COMPOSITIONAL DATA WITH APPLICATION TO MICROBIAL INTER-TAXA ANALYSIS

Danning Li, Arun Srinivasan, Lingzhou Xue, Xiang Zhan

Research output: Contribution to journal, Article, peer-reviewed

2 Scopus citations

Abstract

Estimating the dependence structure in the data is a key task when analyzing compositional data. Real-world compositional data sets are often complex owing to high-dimensionality, heavy tails, and the possible existence of outliers. We consider a general class of elliptical distributions to model the heavy-tailed distribution of latent log-basis variables, which is characterized by a latent shape matrix. The latent shape matrix is a scalar multiple of the latent covariance matrix, when it exists, and it can preserve the directional properties of the dependence in a distribution when the covariance matrix does not exist. We propose using a robust composition-adjusted thresholding procedure based on Tyler’s M-estimator to estimate the latent shape matrices of high-dimensional compositional data from different groups. We prove appealing theoretical properties under the high-dimensional setting. Simulation studies and a real application to microbial inter-taxa analysis demonstrate the numerical properties of the proposed method.
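As a rough illustration of the ingredients named in the abstract, the sketch below runs the classical Tyler fixed-point iteration and then hard-thresholds the off-diagonal entries. It is not the authors' composition-adjusted procedure: it assumes the latent variables are observed directly, are centered at the origin, and that the sample size exceeds the dimension. The function names `tyler_shape` and `hard_threshold`, the trace-equals-p normalization, and the threshold `tau` are all assumptions made for this example.

```python
import numpy as np


def tyler_shape(x, max_iter=200, tol=1e-8):
    """Classical Tyler fixed-point iteration (illustrative, not the paper's estimator).

    Assumes rows of x are centered at the origin, nonzero, and that n > p.
    Returns a shape matrix normalized to have trace p.
    """
    n, p = x.shape
    sigma = np.eye(p)
    for _ in range(max_iter):
        inv = np.linalg.inv(sigma)
        # Quadratic forms x_i' Sigma^{-1} x_i, one per observation
        d = np.einsum("ij,jk,ik->i", x, inv, x)
        # Weighted scatter: (p / n) * sum_i x_i x_i' / d_i
        new = (p / n) * (x.T / d) @ x
        new *= p / np.trace(new)  # fix the scale, which Tyler's equation leaves free
        converged = np.linalg.norm(new - sigma, ord="fro") < tol
        sigma = new
        if converged:
            break
    return sigma


def hard_threshold(sigma, tau):
    """Zero out off-diagonal entries with magnitude below tau; keep the diagonal."""
    out = np.where(np.abs(sigma) >= tau, sigma, 0.0)
    np.fill_diagonal(out, np.diag(sigma))
    return out


# Simulated heavy-tailed elliptical data: multivariate t with 3 degrees of freedom,
# for which the covariance matrix exists but fourth moments do not.
rng = np.random.default_rng(0)
n, p = 500, 5
idx = np.arange(p)
true_shape = 0.5 ** np.abs(np.subtract.outer(idx, idx))  # trace equals p
z = rng.multivariate_normal(np.zeros(p), true_shape, size=n)
x = z / np.sqrt(rng.chisquare(3, size=(n, 1)) / 3)

shape_hat = tyler_shape(x)
sparse_hat = hard_threshold(shape_hat, tau=0.2)  # tau chosen arbitrarily here
```

Because each observation enters only through its direction, rescaling any row of `x` by a positive constant leaves `shape_hat` unchanged, which is the property that makes this kind of estimator insensitive to heavy tails. In the compositional setting the latent log-basis variables are not observed, so a usable method needs an adjustment of the kind the paper proposes; that step is not reproduced here.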

Original language: English (US)
Pages (from-to): 1577-1602
Number of pages: 26
Journal: Statistica Sinica
Volume: 33
State: Published - May 2023

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
