This project is motivated by the large number of genes of unknown function found in sequenced genomes. Genome sequencing and bioinformatics have transformed understanding of the capabilities of living organisms, leading to broad impacts on biotechnology, agriculture, health, and the environment. Despite these advances, the functions of many genes remain unknown, even for well-studied model organisms such as Escherichia coli. In non-model organisms, most functional assignments are not based on direct experimental evidence, but instead on sequence homology and/or shared patterns with known genes across datasets. Automated bioinformatics algorithms have increased the rate of genome annotation, but unfortunately fail to assign functions to 40-60% of all new gene sequences and, worse, exhibit a high rate of mis-annotation. These omissions and propagated errors complicate efforts to computationally model, understand, and engineer organisms and higher living systems. For this reason, it is critical to develop tools for rapidly identifying functions of genes in non-model organisms to better understand their genotype-phenotype relationships. This project will generate such tools and resources (algorithms, strains, and plasmids), share them with the broader scientific community, and provide related training to help reduce these gaps in knowledge and accelerate linking genotypes to phenotypes on a large scale. The project will also create systems and synthetic biology research opportunities for undergraduate and graduate students, and involve outreach efforts to engage K-12 students and the general public.

This project will address the fundamental problem described above for all biological systems by developing a gene annotation pipeline (GAP) toolbox to identify genotypes and media conditions that select for genes encoding enzymes and transporters that catalyze a metabolic reaction of interest.
Synthetic biology methods will then be employed to assemble a library of microbial strains that can select for key metabolic reactions using a small set of non-permissive conditions. Together these tools will be used to close the metabolic knowledge gaps in two important rhizosphere microbes as test cases for the GAP toolbox. The resulting links between genes and reactions will deepen understanding of metabolism in the rhizosphere and enable future basic research on soil as well as novel agricultural biotechnologies. More broadly, improved annotations resulting from this work will be propagated to other sequenced genomes where homologs exist, increasing global understanding of metabolism in other biological systems as well.

This project is co-funded by the Genetic Mechanisms program of the Molecular and Cellular Biosciences Division in the Biological Sciences Directorate.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Effective start/end date: 8/15/23 → 7/31/26
- National Science Foundation: $382,727.00