Toward a Three-Dimensional Chromosome Shape Alphabet

Carlos Soto, Darshan Bryner, Nicola Neretti, Anuj Srivastava

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


The study of the three-dimensional (3D) structure of chromosomes, the largest macromolecules in biology, is one of the most challenging problems in structural biology to date. Here, we develop a novel representation of 3D chromosome structures as sequences of shape letters from a finite shape alphabet. This representation provides a compact and efficient way to analyze ensembles of chromosome shape data, akin to analyzing texts in a language through their letters. We construct a Chromosome Shape Alphabet from an ensemble of chromosome 3D structures inferred from Hi-C data (via SIMBA3D or other methods) by segmenting curves based on topologically associating domain (TAD) boundaries and by clustering the 3D structures of all TADs into groups of similar shapes. The median shapes of these groups, with some pruning and processing, form the Chromosome Shape Letters (CSLs) of the alphabet. As a proof of concept, we reconstruct independent test curves using only CSLs (and corresponding transformations) and compare these reconstructions with the original curves. Finally, we demonstrate how CSLs can be used to summarize shapes in an ensemble of chromosome 3D structures by using generalized sequence logos.
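The pipeline described in the abstract (segment curves at TAD boundaries, cluster the segments by shape, keep one representative per cluster as a letter) can be illustrated with a minimal Python sketch. This is not the authors' method: the abstract does not specify the shape metric or the clustering algorithm, so the sketch assumes a pairwise-distance descriptor (invariant to rotation, translation, and scale) and a simple threshold-based leader clustering, and it uses a medoid as a stand-in for the median shape. All function names, the resampling size `k`, and the tolerance `tol` are hypothetical.

```python
import math

def segment_by_boundaries(curve, boundaries):
    """Split a curve (list of (x, y, z) points) at the given point indices.
    Adjacent segments share their boundary point."""
    cuts = [0] + sorted(boundaries) + [len(curve) - 1]
    return [curve[a:b + 1] for a, b in zip(cuts, cuts[1:]) if b - a >= 1]

def resample(segment, k):
    """Resample a polyline to k points equally spaced in arc length."""
    cum = [0.0]
    for p, q in zip(segment, segment[1:]):
        cum.append(cum[-1] + math.dist(p, q))
    out, j = [], 0
    for i in range(k):
        t = cum[-1] * i / (k - 1)
        while j < len(cum) - 2 and cum[j + 1] < t:
            j += 1
        span = cum[j + 1] - cum[j]
        w = 0.0 if span == 0 else (t - cum[j]) / span
        out.append(tuple(a + w * (b - a)
                         for a, b in zip(segment[j], segment[j + 1])))
    return out

def descriptor(points):
    """Unit-norm vector of pairwise distances (assumed shape descriptor):
    unchanged by rotation, translation, and uniform scaling."""
    d = [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]
    norm = math.sqrt(sum(x * x for x in d)) or 1.0
    return [x / norm for x in d]

def leader_clusters(descs, tol):
    """Assign each shape to the first cluster whose seed is within tol;
    otherwise start a new cluster. Returns lists of shape indices."""
    clusters = []
    for i, d in enumerate(descs):
        for members in clusters:
            if math.dist(d, descs[members[0]]) <= tol:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def medoid(members, descs):
    """Member with the smallest total distance to the rest of its cluster."""
    return min(members,
               key=lambda i: sum(math.dist(descs[i], descs[j]) for j in members))

def build_alphabet(curves, boundaries_per_curve, k=8, tol=0.05):
    """Return (letters, labels): one representative shape per cluster, and
    the letter index assigned to each segment in input order."""
    segments = [s for c, b in zip(curves, boundaries_per_curve)
                for s in segment_by_boundaries(c, b)]
    shapes = [resample(s, k) for s in segments]
    descs = [descriptor(s) for s in shapes]
    clusters = leader_clusters(descs, tol)
    letters = [shapes[medoid(m, descs)] for m in clusters]
    labels = [0] * len(shapes)
    for letter_id, members in enumerate(clusters):
        for i in members:
            labels[i] = letter_id
    return letters, labels

# Toy input: one curve with a right-angle segment followed by a straight
# segment (boundary at index 4), and a second curve that is a single
# straight segment.
curve_a = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (2, 2, 0),
           (2, 3, 0), (2, 4, 0)]
curve_b = [(i, 0, 0) for i in range(5)]
letters, labels = build_alphabet([curve_a, curve_b], [[4], []])
```

In this toy input the two straight segments differ in orientation and length, yet they receive the same label because the descriptor ignores rigid motion and scale. The `labels` list is the sketch's analogue of writing each curve as a sequence of shape letters.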

Original language: English (US)
Pages (from-to): 601-618
Number of pages: 18
Journal: Journal of Computational Biology
Issue number: 6
State: Published - Jun 1 2021

All Science Journal Classification (ASJC) codes

  • Modeling and Simulation
  • Molecular Biology
  • Genetics
  • Computational Mathematics
  • Computational Theory and Mathematics

