TY - JOUR
T1 - An automated model test system for systematic development and improvement of gene expression models
AU - Reis, Alexander C.
AU - Salis, Howard M.
N1 - Publisher Copyright: © 2020 American Chemical Society
PY - 2020/11/20
Y1 - 2020/11/20
N2 - Gene expression models greatly accelerate the engineering of synthetic metabolic pathways and genetic circuits by predicting sequence-function relationships and reducing trial-and-error experimentation. However, developing models with more accurate predictions remains a significant challenge. Here we present a model test system that combines advanced statistics, machine learning, and a database of 9862 characterized genetic systems to automatically quantify model accuracies, accept or reject mechanistic hypotheses, and identify areas for model improvement. We also introduce model capacity, a new information theoretic metric for correct cross-data-set comparisons. We demonstrate the model test system by comparing six models of translation initiation rate, evaluating 100 mechanistic hypotheses, and uncovering new sequence determinants that control protein expression levels. We then applied these results to develop a biophysical model of translation initiation rate with significant improvements in accuracy. Automated model test systems will dramatically accelerate the development of gene expression models, and thereby transition synthetic biology into a mature engineering discipline.
AB - Gene expression models greatly accelerate the engineering of synthetic metabolic pathways and genetic circuits by predicting sequence-function relationships and reducing trial-and-error experimentation. However, developing models with more accurate predictions remains a significant challenge. Here we present a model test system that combines advanced statistics, machine learning, and a database of 9862 characterized genetic systems to automatically quantify model accuracies, accept or reject mechanistic hypotheses, and identify areas for model improvement. We also introduce model capacity, a new information theoretic metric for correct cross-data-set comparisons. We demonstrate the model test system by comparing six models of translation initiation rate, evaluating 100 mechanistic hypotheses, and uncovering new sequence determinants that control protein expression levels. We then applied these results to develop a biophysical model of translation initiation rate with significant improvements in accuracy. Automated model test systems will dramatically accelerate the development of gene expression models, and thereby transition synthetic biology into a mature engineering discipline.
UR - http://www.scopus.com/inward/record.url?scp=85096071584&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096071584&partnerID=8YFLogxK
U2 - 10.1021/acssynbio.0c00394
DO - 10.1021/acssynbio.0c00394
M3 - Article
C2 - 33054181
AN - SCOPUS:85096071584
SN - 2161-5063
VL - 9
SP - 3145
EP - 3156
JO - ACS Synthetic Biology
JF - ACS Synthetic Biology
IS - 11
ER -