TY - JOUR
T1 - Comparative Genomic Analysis of East Asian and Non-Asian Helicobacter pylori Strains Identifies Rapidly Evolving Genes
AU - Duncan, Stacy S.
AU - Valk, Pieter L.
AU - McClain, Mark S.
AU - Shaffer, Carrie L.
AU - Metcalf, Jason A.
AU - Bordenstein, Seth R.
AU - Cover, Timothy L.
PY - 2013/1/31
Y1 - 2013/1/31
N2 - Helicobacter pylori infection is a risk factor for the development of gastric adenocarcinoma, a disease that has a high incidence in East Asia. Genes that are highly divergent in East Asian H. pylori strains compared to non-Asian strains are predicted to encode proteins that differ in functional activity and could represent novel determinants of virulence. To identify such proteins, we undertook a comparative analysis of sixteen H. pylori genomes, selected equally from strains classified as East Asian or non-Asian. As expected, the deduced sequences of two known virulence determinants (CagA and VacA) are highly divergent, with 77% and 87% mean amino acid sequence identities between East Asian and non-Asian groups, respectively. In total, we identified 57 protein sequences that are highly divergent between East Asian and non-Asian strains, but relatively conserved within East Asian strains. The most highly represented functional groups are hypothetical proteins, cell envelope proteins and proteins involved in DNA metabolism. Among the divergent genes with known or predicted functions, population genetic analyses indicate that 86% exhibit evidence of positive selection. McDonald-Kreitman tests further indicate that about one third of these highly divergent genes, including cagA and vacA, are under diversifying selection. We conclude that, similar to cagA and vacA, most of the divergent genes identified in this study evolved under positive selection, and represent candidate factors that may account for the disproportionately high incidence of gastric cancer associated with East Asian H. pylori strains. Moreover, these divergent genes represent robust biomarkers that can be used to differentiate East Asian and non-Asian H. pylori strains.
UR - http://www.scopus.com/inward/record.url?scp=84873149473&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84873149473&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0055120
DO - 10.1371/journal.pone.0055120
M3 - Article
C2 - 23383074
AN - SCOPUS:84873149473
SN - 1932-6203
VL - 8
JO - PLoS ONE
JF - PLoS ONE
IS - 1
M1 - e55120
ER -