TY - CHAP
T1 - Information-Theoretic Inference of an Optimal Dictionary of Protein Supersecondary Structures
AU - Konagurthu, Arun S.
AU - Subramanian, Ramanan
AU - Allison, Lloyd
AU - Abramson, David
AU - de la Banda, Maria Garcia
AU - Stuckey, Peter J.
AU - Lesk, Arthur M.
N1 - Publisher Copyright: © 2019, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2019
Y1 - 2019
N2 - We recently developed an unsupervised Bayesian inference methodology to automatically infer a dictionary of protein supersecondary structures (Subramanian et al., IEEE data compression conference proceedings (DCC), 340–349, 2017). Specifically, this methodology uses the information-theoretic framework of the minimum message length (MML) criterion for hypothesis selection (Wallace, Statistical and inductive inference by minimum message length, Springer Science & Business Media, New York, 2005). The best dictionary of supersecondary structures is the one that yields the most (lossless) compression on the source collection of folding patterns represented as tableaux (matrix representations that capture the essence of protein folding patterns (Lesk, J Mol Graph. 13:159–164, 1995)). This book chapter outlines our MML methodology for inferring the supersecondary structure dictionary. The inferred dictionary is available at http://lcb.infotech.monash.edu.au/proteinConcepts/scop100/dictionary.html.
AB - We recently developed an unsupervised Bayesian inference methodology to automatically infer a dictionary of protein supersecondary structures (Subramanian et al., IEEE data compression conference proceedings (DCC), 340–349, 2017). Specifically, this methodology uses the information-theoretic framework of the minimum message length (MML) criterion for hypothesis selection (Wallace, Statistical and inductive inference by minimum message length, Springer Science & Business Media, New York, 2005). The best dictionary of supersecondary structures is the one that yields the most (lossless) compression on the source collection of folding patterns represented as tableaux (matrix representations that capture the essence of protein folding patterns (Lesk, J Mol Graph. 13:159–164, 1995)). This book chapter outlines our MML methodology for inferring the supersecondary structure dictionary. The inferred dictionary is available at http://lcb.infotech.monash.edu.au/proteinConcepts/scop100/dictionary.html.
UR - http://www.scopus.com/inward/record.url?scp=85064240074&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85064240074&partnerID=8YFLogxK
U2 - 10.1007/978-1-4939-9161-7_6
DO - 10.1007/978-1-4939-9161-7_6
M3 - Chapter
C2 - 30945216
AN - SCOPUS:85064240074
T3 - Methods in Molecular Biology
SP - 123
EP - 131
BT - Methods in Molecular Biology
PB - Humana Press Inc.
ER -