Genome-minimized strains offer advantages as production chassis by reducing transcriptional cost, eliminating competing functions, and limiting unwanted regulatory interactions. Existing approaches for identifying stretches of DNA to remove are largely ad hoc, relying on information about presumably dispensable regions drawn from experimentally determined nonessential genes and comparative genomics. Here we introduce MinGenome, a versatile genome reduction algorithm that uses a mixed-integer linear programming (MILP) formulation to identify, in descending order of size, all dispensable contiguous sequences whose removal does not affect the organism's growth or other desirable traits. Known essential genes, or genes whose loss causes a significant fitness or performance penalty, can be flagged so that their deletion is prohibited. MinGenome also preserves the transcription factors and promoter regions needed to ensure that retained genes are properly transcribed, and it avoids the simultaneous deletion of synthetic lethal pairs. We also explore the potential benefit of removing even larger contiguous stretches of DNA when only one or two essential genes (to be reinserted elsewhere) lie within the deleted sequence. We applied the algorithm to design a minimized E. coli strain and found that we were able to recapitulate the long deletions identified in previous experimental studies and to discover alternative combinations of deletions that have not yet been explored in vivo.
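
The selection problem described above can be illustrated with a toy sketch. It is not the MinGenome MILP and not the authors' code: it swaps in a plain exhaustive search over a hypothetical six-gene genome, and it models only two of the constraints (flagged essential genes and synthetic lethal pairs), leaving out growth, transcription factors, and promoters. All gene names, coordinates, and function names are assumptions made for illustration.

```python
# Toy illustration only: exhaustive search, not a MILP, on made-up data.
# Each gene is (name, start_bp, end_bp), listed in chromosomal order.
genes = [
    ("a", 1, 500), ("b", 501, 2500), ("c", 2501, 3000),
    ("d", 3001, 5000), ("e", 5001, 5400), ("f", 5401, 9000),
]
essential = {"c"}              # hypothetical gene flagged as non-deletable
lethal_pairs = [("d", "f")]    # hypothetical synthetic lethal pair


def longest_deletion(genes, essential, lethal_pairs, deleted):
    """Return (length_bp, first_idx, last_idx) of the longest feasible
    contiguous run of genes, or None if no run is feasible."""
    best = None
    for i in range(len(genes)):
        for j in range(i, len(genes)):
            run = {name for name, _, _ in genes[i:j + 1]}
            # A run is infeasible if it touches an essential gene or a
            # gene already removed; extending it cannot repair that.
            if run & essential or run & deleted:
                break
            # Both members of a synthetic lethal pair must never be gone.
            gone = run | deleted
            if any(x in gone and y in gone for x, y in lethal_pairs):
                break
            length = genes[j][2] - genes[i][1] + 1
            if best is None or length > best[0]:
                best = (length, i, j)
    return best


def plan_deletions(genes, essential, lethal_pairs, max_rounds=10):
    """Repeatedly pick the longest feasible run, so that the resulting
    deletions come out in non-increasing order of size."""
    deleted, plan = set(), []
    for _ in range(max_rounds):
        best = longest_deletion(genes, essential, lethal_pairs, deleted)
        if best is None:
            break
        length, i, j = best
        names = [name for name, _, _ in genes[i:j + 1]]
        plan.append((names, length))
        deleted.update(names)
    return plan


plan = plan_deletions(genes, essential, lethal_pairs)
for names, length in plan:
    print(names, length)
```

In this toy case the search first removes the run containing `e` and `f`, then the run containing `a` and `b`; gene `d` is retained because its synthetic lethal partner `f` is already gone. Exhaustive search scales quadratically in the number of genes per round and cannot express richer constraints, which is one reason an optimization formulation such as a MILP is a more natural fit at genome scale.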
All Science Journal Classification (ASJC) codes
- Biomedical Engineering
- Biochemistry, Genetics and Molecular Biology (miscellaneous)