TY - JOUR
T1 - Serpins in the Caenorhabditis elegans genome
AU - Whisstock, James C.
AU - Irving, James A.
AU - Bottomley, Stephen P.
AU - Pike, Robert N.
AU - Lesk, Arthur M.
PY - 1999/7/1
Y1 - 1999/7/1
AB - Data mining in genome sequences can identify distant homologues of known protein families, and is most powerful if solved structures are available to reveal the three-dimensional implications of very dissimilar sequences. Here we describe putative serpin sequences identified with very high statistical significance in the Caenorhabditis elegans genome. When mapped onto vertebrate serpins such as α1-antitrypsin, they suggest novel structural features. Some appear complete, some show extensive deletions, and others appear to contain only the C-terminal part of the known serpin fold, probably in partnership with N-terminal regions that have conformations unlike those of known serpins. The observation of such striking sequence similarity, in proteins that must have significantly different overall structures, substantially extends the structural characteristics of the serpin family of proteins.
UR - http://www.scopus.com/inward/record.url?scp=0033168897&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0033168897&partnerID=8YFLogxK
U2 - 10.1002/(SICI)1097-0134(19990701)36:1<31::AID-PROT3>3.0.CO;2-Q
DO - 10.1002/(SICI)1097-0134(19990701)36:1<31::AID-PROT3>3.0.CO;2-Q
M3 - Article
C2 - 10373004
AN - SCOPUS:0033168897
SN - 0887-3585
VL - 36
SP - 31
EP - 41
JO - Proteins: Structure, Function and Genetics
JF - Proteins: Structure, Function and Genetics
IS - 1
ER -