TY - JOUR

T1 - The inverse protein folding problem on 2D and 3D lattices

AU - Berman, Piotr

AU - DasGupta, Bhaskar

AU - Mubayi, Dhruv

AU - Sloan, Robert

AU - Turán, György

AU - Zhang, Yi

PY - 2007/4/1

Y1 - 2007/4/1

N2 - In this paper we investigate the inverse protein folding (IPF) problem under the Canonical model on 3D and 2D lattices [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136; E.I. Shakhnovich, A.M. Gutin, Engineering of stable and fast-folding sequences of model proteins, Proc. Natl. Acad. Sci. 90 (1993) 7195-7199]. In this problem, we are given a contact graph G = (V, E) of a protein sequence that is embeddable in a 3D (respectively, 2D) lattice and an integer 1 ≤ K ≤ | V |. The goal is to find an induced subgraph of G of at most K vertices with the maximum number of edges. In this paper, we prove the following results:•An earlier proof of NP-completeness of the IPF problem on 3D lattices [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136] is based on the NP-completeness of the IPF problem on the 2D lattices. However, the reduction was not correct and we show that the IPF problem for 2D lattices can be solved in O (K | V |) time. But, we show that the IPF problem on 3D lattices is indeed NP-complete by a providing a different reduction from a different NP-complete problem.•We design a polynomial-time approximation scheme for the IPF problem on 3D lattices using the shifted slice-and-dice approach in [P. Berman, B. DasGupta, S. Muthukrishnan, Approximation algorithms for MAX-MIN tiling, J. Algorithms 47(2) (2003) 122-134; D. Hochbaum, Approximation Algorithms for NP-hard Problems, PWS Publishing Company, MA, 1997; D.S. Hochbaum, W. Mass, Approximation schemes for covering and packing problems in image processing and VLSI, J. ACM 32(1) (1985) 130-136], thereby improving the previous best polynomial-time approximation algorithm which had a performance ratio of frac(1, 2) [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136].

AB - In this paper we investigate the inverse protein folding (IPF) problem under the Canonical model on 3D and 2D lattices [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136; E.I. Shakhnovich, A.M. Gutin, Engineering of stable and fast-folding sequences of model proteins, Proc. Natl. Acad. Sci. 90 (1993) 7195-7199]. In this problem, we are given a contact graph G = (V, E) of a protein sequence that is embeddable in a 3D (respectively, 2D) lattice and an integer 1 ≤ K ≤ | V |. The goal is to find an induced subgraph of G of at most K vertices with the maximum number of edges. In this paper, we prove the following results:•An earlier proof of NP-completeness of the IPF problem on 3D lattices [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136] is based on the NP-completeness of the IPF problem on the 2D lattices. However, the reduction was not correct and we show that the IPF problem for 2D lattices can be solved in O (K | V |) time. But, we show that the IPF problem on 3D lattices is indeed NP-complete by a providing a different reduction from a different NP-complete problem.•We design a polynomial-time approximation scheme for the IPF problem on 3D lattices using the shifted slice-and-dice approach in [P. Berman, B. DasGupta, S. Muthukrishnan, Approximation algorithms for MAX-MIN tiling, J. Algorithms 47(2) (2003) 122-134; D. Hochbaum, Approximation Algorithms for NP-hard Problems, PWS Publishing Company, MA, 1997; D.S. Hochbaum, W. Mass, Approximation schemes for covering and packing problems in image processing and VLSI, J. ACM 32(1) (1985) 130-136], thereby improving the previous best polynomial-time approximation algorithm which had a performance ratio of frac(1, 2) [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136].

UR - http://www.scopus.com/inward/record.url?scp=33846807406&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33846807406&partnerID=8YFLogxK

U2 - 10.1016/j.dam.2005.09.018

DO - 10.1016/j.dam.2005.09.018

M3 - Article

AN - SCOPUS:33846807406

SN - 0166-218X

VL - 155

SP - 719

EP - 732

JO - Discrete Applied Mathematics

JF - Discrete Applied Mathematics

IS - 6-7

ER -