TY - GEN
T1 - Improving protein-RNA interface prediction by combining sequence homology based method with a naive bayes classifier
T2 - 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, ACM-BCB 2011
AU - Xue, Li C.
AU - Walia, Rasna
AU - EL-Manzalawy, Yasser
AU - Dobbs, Drena
AU - Honavar, Vasant
PY - 2011
Y1 - 2011
N2 - Protein-RNA interactions play important roles in cellular processes like protein synthesis, RNA processing, and gene expression regulation. Reliable identification of the interfaces involved in RNA-protein interactions is essential for comprehending the mechanisms and the functional implications of these interactions and provides a valuable guide for rational drug discovery and design. Because the determination of 3D structures of protein-RNA complexes has various technical limitations and is typically costly, reliable in silico interface prediction methods that require only the sequence information are urgently needed. We present HomPRIP, a homologous sequence based method for predicting protein-RNA interfaces, based on our conservation analysis of protein-RNA interfaces. We test HomPRIP on a benchmark dataset of 199 proteins and compare it with the state-of-the-art protein-RNA interface prediction methods. Our results show that HomPRIP can reliably identify protein-RNA interface residues in 71% of test proteins with at least one putative sequence homolog passing the similarity thresholds of HomPRIP. Moreover, to facilitate predictions for proteins with no identified homologs, we develop HomPRIP-NB, a method combining the HomPRIP predictor and a Naive Bayes (NB) classifier trained using evolutionary information derived from alignments against the NCBI nr database. Our results suggest that HomPRIP-NB significantly outperforms the state-of-the-art machine learning methods for predicting protein-RNA interface residues.
AB - Protein-RNA interactions play important roles in cellular processes like protein synthesis, RNA processing, and gene expression regulation. Reliable identification of the interfaces involved in RNA-protein interactions is essential for comprehending the mechanisms and the functional implications of these interactions and provides a valuable guide for rational drug discovery and design. Because the determination of 3D structures of protein-RNA complexes has various technical limitations and is typically costly, reliable in silico interface prediction methods that require only the sequence information are urgently needed. We present HomPRIP, a homologous sequence based method for predicting protein-RNA interfaces, based on our conservation analysis of protein-RNA interfaces. We test HomPRIP on a benchmark dataset of 199 proteins and compare it with the state-of-the-art protein-RNA interface prediction methods. Our results show that HomPRIP can reliably identify protein-RNA interface residues in 71% of test proteins with at least one putative sequence homolog passing the similarity thresholds of HomPRIP. Moreover, to facilitate predictions for proteins with no identified homologs, we develop HomPRIP-NB, a method combining the HomPRIP predictor and a Naive Bayes (NB) classifier trained using evolutionary information derived from alignments against the NCBI nr database. Our results suggest that HomPRIP-NB significantly outperforms the state-of-the-art machine learning methods for predicting protein-RNA interface residues.
UR - http://www.scopus.com/inward/record.url?scp=84858987590&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84858987590&partnerID=8YFLogxK
U2 - 10.1145/2147805.2147899
DO - 10.1145/2147805.2147899
M3 - Conference contribution
AN - SCOPUS:84858987590
SN - 9781450307963
T3 - 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2011
SP - 556
EP - 558
BT - 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2011
Y2 - 1 August 2011 through 3 August 2011
ER -