TY - JOUR
T1 - Selection and thermostability suggest G-quadruplexes are novel functional elements of the human genome
AU - Guiblet, Wilfried M.
AU - DeGiorgio, Michael
AU - Cheng, Xiaoheng
AU - Chiaromonte, Francesca
AU - Eckert, Kristin A.
AU - Huang, Yi Fei
AU - Makova, Kateryna D.
N1 - Publisher Copyright: © 2021 Guiblet et al. This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
PY - 2021/7
Y1 - 2021/7
N2 - Approximately 1% of the human genome has the ability to fold into G-quadruplexes (G4s)—noncanonical strand-specific DNA structures forming at G-rich motifs. G4s regulate several key cellular processes (e.g., transcription) and have been hypothesized to participate in others (e.g., firing of replication origins). Moreover, G4s differ in their thermostability, and this may affect their function. Yet, G4s may also hinder replication, transcription, and translation and may increase genome instability and mutation rates. Therefore, depending on their genomic location, thermostability, and functionality, G4 loci might evolve under different selective pressures, which has never been investigated. Here we conducted the first genome-wide analysis of G4 distribution, thermostability, and selection. We found an overrepresentation, high thermostability, and purifying selection for G4s within genic components in which they are expected to be functional—promoters, CpG islands, and 5′ and 3′ UTRs. A similar pattern was observed for G4s within replication origins, enhancers, eQTLs, and TAD boundary regions, strongly suggesting their functionality. In contrast, G4s on the nontranscribed strand of exons were underrepresented, were unstable, and evolved neutrally. In general, G4s on the nontranscribed strand of genic components had lower density and were less stable than those on the transcribed strand, suggesting that the former are avoided at the RNA level. Across the genome, purifying selection was stronger at stable G4s. Our results suggest that purifying selection preserves the sequences of functional G4s, whereas nonfunctional G4s are too costly to be tolerated in the genome. Thus, G4s are emerging as fundamental, functional genomic elements.
UR - http://www.scopus.com/inward/record.url?scp=85108985312&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85108985312&partnerID=8YFLogxK
U2 - 10.1101/gr.269589.120
DO - 10.1101/gr.269589.120
M3 - Article
C2 - 34187812
AN - SCOPUS:85108985312
SN - 1088-9051
VL - 31
SP - 1136
EP - 1149
JO - Genome Research
JF - Genome Research
IS - 7
ER -