TY - JOUR
T1 - The inverse protein folding problem on 2D and 3D lattices
AU - Berman, Piotr
AU - DasGupta, Bhaskar
AU - Mubayi, Dhruv
AU - Sloan, Robert
AU - Turán, György
AU - Zhang, Yi
PY - 2007/4/1
Y1 - 2007/4/1
N2 - In this paper we investigate the inverse protein folding (IPF) problem under the Canonical model on 3D and 2D lattices [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136; E.I. Shakhnovich, A.M. Gutin, Engineering of stable and fast-folding sequences of model proteins, Proc. Natl. Acad. Sci. 90 (1993) 7195-7199]. In this problem, we are given a contact graph G = (V, E) of a protein sequence that is embeddable in a 3D (respectively, 2D) lattice and an integer 1 ≤ K ≤ | V |. The goal is to find an induced subgraph of G of at most K vertices with the maximum number of edges. In this paper, we prove the following results:•An earlier proof of NP-completeness of the IPF problem on 3D lattices [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136] is based on the NP-completeness of the IPF problem on the 2D lattices. However, the reduction was not correct and we show that the IPF problem for 2D lattices can be solved in O (K | V |) time. But, we show that the IPF problem on 3D lattices is indeed NP-complete by a providing a different reduction from a different NP-complete problem.•We design a polynomial-time approximation scheme for the IPF problem on 3D lattices using the shifted slice-and-dice approach in [P. Berman, B. DasGupta, S. Muthukrishnan, Approximation algorithms for MAX-MIN tiling, J. Algorithms 47(2) (2003) 122-134; D. Hochbaum, Approximation Algorithms for NP-hard Problems, PWS Publishing Company, MA, 1997; D.S. Hochbaum, W. Mass, Approximation schemes for covering and packing problems in image processing and VLSI, J. ACM 32(1) (1985) 130-136], thereby improving the previous best polynomial-time approximation algorithm which had a performance ratio of frac(1, 2) [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136].
AB - In this paper we investigate the inverse protein folding (IPF) problem under the Canonical model on 3D and 2D lattices [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136; E.I. Shakhnovich, A.M. Gutin, Engineering of stable and fast-folding sequences of model proteins, Proc. Natl. Acad. Sci. 90 (1993) 7195-7199]. In this problem, we are given a contact graph G = (V, E) of a protein sequence that is embeddable in a 3D (respectively, 2D) lattice and an integer 1 ≤ K ≤ | V |. The goal is to find an induced subgraph of G of at most K vertices with the maximum number of edges. In this paper, we prove the following results:•An earlier proof of NP-completeness of the IPF problem on 3D lattices [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136] is based on the NP-completeness of the IPF problem on the 2D lattices. However, the reduction was not correct and we show that the IPF problem for 2D lattices can be solved in O (K | V |) time. But, we show that the IPF problem on 3D lattices is indeed NP-complete by a providing a different reduction from a different NP-complete problem.•We design a polynomial-time approximation scheme for the IPF problem on 3D lattices using the shifted slice-and-dice approach in [P. Berman, B. DasGupta, S. Muthukrishnan, Approximation algorithms for MAX-MIN tiling, J. Algorithms 47(2) (2003) 122-134; D. Hochbaum, Approximation Algorithms for NP-hard Problems, PWS Publishing Company, MA, 1997; D.S. Hochbaum, W. Mass, Approximation schemes for covering and packing problems in image processing and VLSI, J. ACM 32(1) (1985) 130-136], thereby improving the previous best polynomial-time approximation algorithm which had a performance ratio of frac(1, 2) [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128-136].
UR - http://www.scopus.com/inward/record.url?scp=33846807406&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33846807406&partnerID=8YFLogxK
U2 - 10.1016/j.dam.2005.09.018
DO - 10.1016/j.dam.2005.09.018
M3 - Article
AN - SCOPUS:33846807406
SN - 0166-218X
VL - 155
SP - 719
EP - 732
JO - Discrete Applied Mathematics
JF - Discrete Applied Mathematics
IS - 6-7
ER -