Universal Architectural Concepts Underlying Protein Folding Patterns

Arun S. Konagurthu, Ramanan Subramanian, Lloyd Allison, David Abramson, Peter J. Stuckey, Maria Garcia de la Banda, Arthur M. Lesk

Research output: Contribution to journal › Article › peer-review

8 Scopus citations


What is the architectural “basis set” of the observed universe of protein structures? Using information-theoretic inference, we answer this question with a dictionary of 1,493 substructures—called concepts—typically at a subdomain level, based on an unbiased subset of known protein structures. Each concept represents a topologically conserved assembly of helices and strands that make contact. Any protein structure can be dissected into instances of concepts from this dictionary. We dissected the Protein Data Bank and completely inventoried all the concept instances. This yields many insights, including correlations between concepts and catalytic activities or binding sites, useful for rational drug design; local amino-acid sequence–structure correlations, useful for ab initio structure prediction methods; and information supporting the recognition and exploration of evolutionary relationships, useful for structural studies. An interactive site, Proçodic, at http://lcb.infotech.monash.edu.au/prosodic (click), provides access to and navigation of the entire dictionary of concepts and their usages, and all associated information. This report is part of a continuing programme with the goal of elucidating fundamental principles of protein architecture, in the spirit of the work of Cyrus Chothia.

Original language: English (US)
Article number: 612920
Journal: Frontiers in Molecular Biosciences
State: Published - Apr 30 2021

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Molecular Biology
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)

