Developing predictive models of multi-protein genetic systems to understand and optimize their behavior remains a combinatorial challenge, particularly when measurement throughput is limited. We developed a computational approach to build predictive models and identify optimal sequences and expression levels, while circumventing combinatorial explosion. Maximally informative genetic system variants were first designed by the RBS Library Calculator, an algorithm to design sequences for efficiently searching a multi-protein expression space across a >10,000-fold range with tailored search parameters and well-predicted translation rates. We validated the algorithm's predictions by characterizing 646 genetic system variants, encoded in plasmids and genomes, expressed in six gram-positive and gram-negative bacterial hosts. We then combined the search algorithm with system-level kinetic modeling, requiring the construction and characterization of 73 variants to build a sequence-expression-activity map (SEAMAP) for a biosynthesis pathway. Using model predictions, we designed and characterized 47 additional pathway variants to navigate its activity space, find optimal expression regions with desired activity response curves, and relieve rate-limiting steps in metabolism. Creating sequence-expression-activity maps accelerates the optimization of many protein systems and allows previous measurements to quantitatively inform future designs.

Synopsis

A computational approach for designing multi-enzyme pathways is presented that combines sequence optimization, kinetic modeling, and accurate expression prediction. The method is illustrated for the direct forward engineering of robust circuits with targeted activities.

- Biophysical modeling and computational design are combined to create predictive sequence-expression-activity maps (SEAMAPs) for multi-protein genetic systems.
- The algorithm designs the smallest number of variants with expression levels covering the largest part of the multi-protein expression space.
- The predictions are validated using 646 genetic system variants, encoded on plasmids and genomes and expressed in gram-negative and gram-positive bacteria.
- A SEAMAP of a 3-enzyme biosynthesis pathway is used to optimize the pathway's sequences and expression levels for different design objectives.

All Science Journal Classification (ASJC) codes
- General Biochemistry, Genetics and Molecular Biology
- General Immunology and Microbiology
- General Agricultural and Biological Sciences
- Applied Mathematics