EDGE CMT: Predicting bacteriophage susceptibility from Escherichia coli genotype

  • Arkin, Adam P. (PI)
  • Listgarten, Jennifer (CoPI)
  • Mutalik, Vivek K. (CoPI)
  • Dudley, Edward G. (CoPI)

Project: Research project

Project Details


This award to the University of California-Berkeley is made to support investigations of host-pathogen interactions using bacteria and phage as a tractable experimental system. The bacterial virome, the collection of viruses that parasitize the microbes, is a critical feature of microbial community dynamics, activity and adaptation. As part of these community dynamics, bacterial viruses (bacteriophage or phages) attack exceptionally specific bacterial hosts, much like other viruses infect only very specific plants or animals. However, the mechanisms underlying this specificity are deeply under-characterized, and studies have largely focused on a handful of individual bacterium-phage systems. The lack of insight into phage specificity and the breadth of bacterial responses to different phages has limited our ability to build models that can predict which phages have the potential to infect specific bacterial strains. Research supported by this award will help to fill this knowledge gap for an environmentally and medically important species of bacteria and its phage by exploiting an extensive collection of thousands of non-model Escherichia coli (E. coli) strains originating from hundreds of environmental and animal reservoirs. The researchers will use a correspondingly diverse collection of phages alongside high-throughput genetics and measurement to map the susceptibility of these bacteria to infection and create models to predict, given the genome of a new strain of E. coli, which phage might be most effective at targeting it. Graduate students and postdoctoral trainees from under-represented groups will be supported by this award, and the researchers will use data from these studies for data science training efforts available to larger groups.
Results of this effort may eventually lead to effective bio-control options to manage bacterial populations, potentially reducing the need for antimicrobial use in a wide range of application areas including agriculture, sanitation, industrial processes, and biomedical environments.
Since the discovery of phages 100 years ago, our knowledge of their abundance, diversity and modes of infectivity, and of their contribution to horizontal gene transfer, microbiome structure and functional traits, has remained limited to a few environmental contexts and individual bacterium-phage systems. Research supported by this award will create a machine-learning-driven experimental workflow that exploits natural genetic variation in bacterial strains and associated phages, scalable susceptibility assays and high-throughput genetics to create a predictive model connecting bacterial genotype to phage susceptibility phenotype. The researchers will leverage an extensive collection of non-model E. coli strains originating from hundreds of reservoirs and geographic locations representative of agriculturally, medically and environmentally important species. The researchers will employ high-throughput genomics and genetics to gain a mechanistic understanding of thousands of phage-host interactions necessary to build predictive models linking bacterial genotype to phage susceptibility phenotype. The success of this project will advance our understanding of complex susceptibility phenotypes and could enable development of rational phage-cocktail formulations to treat drug-resistant infections, implement biocontrol measures and enable precision microbiome engineering applications. The results of the studies will be presented at scientific meetings and published in peer-reviewed journals.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Effective start/end date: 10/1/22 to 9/30/26


  • National Science Foundation: $2,055,157.00

