TY - JOUR
T1 - High Satellite Repeat Turnover in Great Apes Studied with Short- and Long-Read Technologies
AU - Cechova, Monika
AU - Harris, Robert S.
AU - Tomaszkiewicz, Marta
AU - Arbeithuber, Barbara
AU - Chiaromonte, Francesca
AU - Makova, Kateryna D.
N1 - Funding Information:
We are grateful to Marzia Cremona, Kate Anthony, Oliver Ryder, Mark Shriver, Malcolm Ferguson-Smith, Jorge Pereira, and Shaun Mahony for their assistance. We thank Wilfried Guiblet and Arslan Zaidi for valuable biological insights. This work was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R01GM130691. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Funding was also provided by the Eberly College of Sciences, The Huck Institute of Life Sciences, and the Institute for CyberScience, at Penn State, as well as, in part, under grants from the Pennsylvania Department of Health using Tobacco Settlement and CURE Funds. The department specifically disclaims any responsibility for any analyses, interpretations, or conclusions.
Publisher Copyright:
© 2019 The Author(s). Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
PY - 2019/11/1
Y1 - 2019/11/1
AB - Satellite repeats are a structural component of centromeres and telomeres, and in some instances, their divergence is known to drive speciation. Due to their highly repetitive nature, satellite sequences have been understudied and underrepresented in genome assemblies. To investigate their turnover in great apes, we studied satellite repeats of unit sizes up to 50 bp in human, chimpanzee, bonobo, gorilla, and Sumatran and Bornean orangutans, using unassembled short and long sequencing reads. The density of satellite repeats, as identified from accurate short reads (Illumina), varied greatly among great ape genomes. These were dominated by a handful of abundant repeated motifs, frequently shared among species, which formed two groups: 1) the (AATGG)n repeat (critical for heat shock response) and its derivatives; and 2) subtelomeric 32-mers involved in telomeric metabolism. Using the densities of abundant repeats, individuals could be classified into species. However, clustering did not reproduce the accepted species phylogeny, suggesting rapid repeat evolution. Several abundant repeats were enriched in males versus females; using Y chromosome assemblies or Fluorescent In Situ Hybridization, we validated their location on the Y. Finally, applying a novel computational tool, we identified many satellite repeats completely embedded within long Oxford Nanopore and Pacific Biosciences reads. Such repeats were up to 59 kb in length and consisted of perfect repeats interspersed with other similar sequences. Our results based on sequencing reads generated with three different technologies provide the first detailed characterization of great ape satellite repeats, and open new avenues for exploring their functions.
UR - http://www.scopus.com/inward/record.url?scp=85082129496&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082129496&partnerID=8YFLogxK
U2 - 10.1093/molbev/msz156
DO - 10.1093/molbev/msz156
M3 - Article
C2 - 31273383
AN - SCOPUS:85082129496
SN - 0737-4038
VL - 36
SP - 2415
EP - 2431
JO - Molecular Biology and Evolution
JF - Molecular Biology and Evolution
IS - 11
ER -