TY - JOUR
T1 - Predicting protein folds with fold-specific PSSM libraries
AU - Hong, Yoojin
AU - Chintapalli, Sree Vamsee
AU - Ko, Kyung Dae
AU - Bhardwaj, Gaurav
AU - Zhang, Zhenhai
AU - van Rossum, Damian
AU - Patterson, Randen L.
PY - 2011
Y1 - 2011
N2 - Accurately assigning folds to divergent protein sequences is a major obstacle to structural studies. Herein, we outline an effective method for fold recognition using sets of PSSMs, each constructed for a different protein fold. Our analyses demonstrate that FSL (Fold-specific Position Specific Scoring Matrix Libraries) can predict and relate the structures of highly divergent proteins given only their amino acid sequences. This ability to detect distant relationships depends on the low-identity sequence alignments obtained from FSL. Results from our experiments demonstrate that FSL perform well in recognizing folds from the "twilight-zone" SABmark dataset. Further, this method is capable of accurate fold prediction for newly determined structures. We suggest that by building complete PSSM libraries for all unique folds within the Protein Data Bank (PDB), FSL can be used to rapidly and reliably annotate a large subset of protein folds at the proteomic level. The related programs and fold-specific PSSMs for FSL are publicly available at http://ccp.psu.edu/download/FSLv1.0/.
AB - Accurately assigning folds to divergent protein sequences is a major obstacle to structural studies. Herein, we outline an effective method for fold recognition using sets of PSSMs, each constructed for a different protein fold. Our analyses demonstrate that FSL (Fold-specific Position Specific Scoring Matrix Libraries) can predict and relate the structures of highly divergent proteins given only their amino acid sequences. This ability to detect distant relationships depends on the low-identity sequence alignments obtained from FSL. Results from our experiments demonstrate that FSL perform well in recognizing folds from the "twilight-zone" SABmark dataset. Further, this method is capable of accurate fold prediction for newly determined structures. We suggest that by building complete PSSM libraries for all unique folds within the Protein Data Bank (PDB), FSL can be used to rapidly and reliably annotate a large subset of protein folds at the proteomic level. The related programs and fold-specific PSSMs for FSL are publicly available at http://ccp.psu.edu/download/FSLv1.0/.
UR - http://www.scopus.com/inward/record.url?scp=79959199830&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79959199830&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0020557
DO - 10.1371/journal.pone.0020557
M3 - Article
C2 - 21698189
AN - SCOPUS:79959199830
SN - 1932-6203
VL - 6
JO - PLoS ONE
JF - PLoS ONE
IS - 6
M1 - e20557
ER -