Quasi-prime peptides: identification of the shortest peptide sequences unique to a species

Ioannis Mouratidis, Candace S.Y. Chan, Nikol Chantzi, Georgios Christos Tsiatsianis, Martin Hemberg, Nadav Ahituv, Ilias Georgakopoulos-Soares

Research output: Contribution to journal, Article, peer-review

9 Scopus citations

Abstract

Determining the organisms present in a biosample has many important applications in agriculture, wildlife conservation, and healthcare. Here, we develop a universal fingerprint based on the identification of short peptides that are unique to a specific organism. We define quasi-prime peptides as sequences that are found in only one species, and we analyzed proteomes from 21 875 species, from viruses to humans, and annotated the smallest peptide kmer sequences that are unique to a species and absent from all other proteomes. We also perform simulations across all reference proteomes and observe a lower than expected number of peptide kmers across species and taxonomies, indicating an enrichment for nullpeptides, sequences absent from a proteome. For humans, we find that quasi-primes are found in genes enriched for specific gene ontology terms, including proteasome and ATP and GTP catalysis. We also provide a set of quasi-prime peptides for a number of human pathogens and model organisms and further showcase its utility via two case studies for Mycobacterium tuberculosis and Vibrio cholerae, where we identify quasi-prime peptides in two transmembrane and extracellular proteins with relevance for pathogen detection. Our catalog of quasi-prime peptides provides the smallest unit of information that is specific to a single organism at the protein level, providing a versatile tool for species identification.
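The definition in the abstract (a peptide k-mer present in exactly one species and absent from all other proteomes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `kmers`, `quasi_primes`, and `shortest_quasi_primes`, the toy sequences, and the in-memory dictionary approach are all assumptions made for illustration, and a scan of tens of thousands of real proteomes would need a far more memory-efficient method.

```python
from collections import defaultdict


def kmers(sequence, k):
    """Return the set of all length-k substrings of a protein sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def quasi_primes(proteomes, k):
    """Map each species to the k-mers found in its proteome and in no other.

    proteomes: dict of species name -> list of protein sequences (toy input).
    """
    owners = defaultdict(set)  # k-mer -> set of species in which it occurs
    for species, proteins in proteomes.items():
        for protein in proteins:
            for kmer in kmers(protein, k):
                owners[kmer].add(species)

    unique = {species: set() for species in proteomes}
    for kmer, species_set in owners.items():
        if len(species_set) == 1:  # seen in exactly one species
            (only,) = species_set
            unique[only].add(kmer)
    return unique


def shortest_quasi_primes(proteomes, max_k):
    """For each species, return (k, k-mers) at the smallest k with unique k-mers."""
    result = {}
    for k in range(1, max_k + 1):
        for species, peptides in quasi_primes(proteomes, k).items():
            if peptides and species not in result:
                result[species] = (k, peptides)
    return result


# Hypothetical toy proteomes, not real data.
toy = {
    "species_a": ["MKTAYIAK", "MKLV"],
    "species_b": ["MKTAYLAK"],
}

if __name__ == "__main__":
    for species, (k, peptides) in shortest_quasi_primes(toy, 3).items():
        print(species, k, sorted(peptides))
```

In this toy input, `species_a` already has unique 1-mers, while `species_b` only gains unique peptides at length 2, which mirrors the idea of reporting the smallest k at which a species becomes distinguishable.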

Original language: English (US)
Article number: lqad039
Journal: NAR Genomics and Bioinformatics
Volume: 5
Issue number: 2
State: Published - Jun 1 2023

All Science Journal Classification (ASJC) codes

  • Structural Biology
  • Molecular Biology
  • Genetics
  • Computer Science Applications
  • Applied Mathematics
