TY - JOUR
T1 - Protein structure prediction improves the quality of amino-acid sequence alignment
AU - Lesk, Arthur M.
AU - Konagurthu, Arun S.
N1 - Publisher Copyright: © 2022 Wiley Periodicals LLC.
PY - 2022/12
Y1 - 2022/12
N2 - The basic operation in analysis of protein evolution is alignment: the specification of residue-residue correspondences. A structural alignment is a specification of residue-residue correspondences based on the atomic positions in the structures of two or more proteins. It is well-known that structural alignments are more accurate, over a much wider range of divergence, than pairwise alignments based solely on sequences—for instance computed with the Needleman–Wunsch algorithm with affine gap penalties. Given the amino-acid sequences of two proteins, alignments based solely on the sequences fall into “daylight”, “twilight”, and “midnight” zones, in which the fidelity of the correspondences diminishes in accuracy, and in strength of ability to distinguish true homology from noise. The success of AlphaFold2 in template-free modeling of three-dimensional structures from one-dimensional amino-acid sequence information implies that: given the amino-acid sequences of two or more proteins, in the absence of experimentally determined structures, reliable alignments—even for very highly diverged proteins—could in many cases be achieved by applying AlphaFold2 to the sequences, and performing structural alignments of the models.
AB - The basic operation in analysis of protein evolution is alignment: the specification of residue-residue correspondences. A structural alignment is a specification of residue-residue correspondences based on the atomic positions in the structures of two or more proteins. It is well-known that structural alignments are more accurate, over a much wider range of divergence, than pairwise alignments based solely on sequences—for instance computed with the Needleman–Wunsch algorithm with affine gap penalties. Given the amino-acid sequences of two proteins, alignments based solely on the sequences fall into “daylight”, “twilight”, and “midnight” zones, in which the fidelity of the correspondences diminishes in accuracy, and in strength of ability to distinguish true homology from noise. The success of AlphaFold2 in template-free modeling of three-dimensional structures from one-dimensional amino-acid sequence information implies that: given the amino-acid sequences of two or more proteins, in the absence of experimentally determined structures, reliable alignments—even for very highly diverged proteins—could in many cases be achieved by applying AlphaFold2 to the sequences, and performing structural alignments of the models.
UR - http://www.scopus.com/inward/record.url?scp=85134176451&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85134176451&partnerID=8YFLogxK
U2 - 10.1002/prot.26392
DO - 10.1002/prot.26392
M3 - Article
C2 - 35754316
AN - SCOPUS:85134176451
SN - 0887-3585
VL - 90
SP - 2144
EP - 2147
JO - Proteins: Structure, Function and Bioinformatics
JF - Proteins: Structure, Function and Bioinformatics
IS - 12
ER -