TY - JOUR
T1 - Genomic structural variants constrain and facilitate adaptation in natural populations of Theobroma cacao, the chocolate tree
AU - Hämälä, Tuomas
AU - Wafula, Eric K.
AU - Guiltinan, Mark J.
AU - Ralph, Paula E.
AU - dePamphilis, Claude W.
AU - Tiffin, Peter
N1 - Funding Information:
We thank J. H. Marden, S. N. Maximova, D. Zhang, B. M. Tyler, and N. Winters for comments and discussions during the project, A. S. Fister and M. E. Leandro-Muñoz for sample collection, L. Landherr Shaeffer and M. G. Perryman for technical support, N. M. Springer and two anonymous reviewers for their comments on improving the manuscript, and Centro Agronómico Tropical de Investigación y Enseñanza (CATIE) for providing access to the cacao germplasm. Computational resources were provided by the Institute for Computational and Data Sciences at The Pennsylvania State University and the Minnesota Supercomputing Institute at the University of Minnesota. This work was supported by NSF Grant IOS-1546863 and the US Department of Agriculture National Institute of Food and Agriculture, Federal Appropriations under Project PEN04569 and accession number 1003147.
Publisher Copyright:
© 2021 National Academy of Sciences. All rights reserved.
PY - 2021/8/31
Y1 - 2021/8/31
N2 - Genomic structural variants (SVs) can play important roles in adaptation and speciation. Yet the overall fitness effects of SVs are poorly understood, partly because accurate population-level identification of SVs requires multiple high-quality genome assemblies. Here, we use 31 chromosome-scale, haplotype-resolved genome assemblies of Theobroma cacao (an outcrossing, long-lived tree species that is the source of chocolate) to investigate the fitness consequences of SVs in natural populations. Among the 31 accessions, we find over 160,000 SVs, which together cover eight times more of the genome than single-nucleotide polymorphisms and short indels (125 versus 15 Mb). Our results indicate that a vast majority of these SVs are deleterious: they segregate at low frequencies and are depleted from functional regions of the genome. We show that SVs influence gene expression, which likely impairs gene function and contributes to the detrimental effects of SVs. We also provide empirical support for a theoretical prediction that SVs, particularly inversions, increase genetic load through the accumulation of deleterious nucleotide variants as a result of suppressed recombination. Despite the overall detrimental effects, we identify individual SVs bearing signatures of local adaptation, several of which are associated with genes differentially expressed between populations. Genes involved in pathogen resistance are strongly enriched among these candidates, highlighting the contribution of SVs to this important local adaptation trait. Beyond revealing empirical evidence for the evolutionary importance of SVs, these 31 de novo assemblies provide a valuable resource for genetic and breeding studies in T. cacao.
AB - Genomic structural variants (SVs) can play important roles in adaptation and speciation. Yet the overall fitness effects of SVs are poorly understood, partly because accurate population-level identification of SVs requires multiple high-quality genome assemblies. Here, we use 31 chromosome-scale, haplotype-resolved genome assemblies of Theobroma cacao (an outcrossing, long-lived tree species that is the source of chocolate) to investigate the fitness consequences of SVs in natural populations. Among the 31 accessions, we find over 160,000 SVs, which together cover eight times more of the genome than single-nucleotide polymorphisms and short indels (125 versus 15 Mb). Our results indicate that a vast majority of these SVs are deleterious: they segregate at low frequencies and are depleted from functional regions of the genome. We show that SVs influence gene expression, which likely impairs gene function and contributes to the detrimental effects of SVs. We also provide empirical support for a theoretical prediction that SVs, particularly inversions, increase genetic load through the accumulation of deleterious nucleotide variants as a result of suppressed recombination. Despite the overall detrimental effects, we identify individual SVs bearing signatures of local adaptation, several of which are associated with genes differentially expressed between populations. Genes involved in pathogen resistance are strongly enriched among these candidates, highlighting the contribution of SVs to this important local adaptation trait. Beyond revealing empirical evidence for the evolutionary importance of SVs, these 31 de novo assemblies provide a valuable resource for genetic and breeding studies in T. cacao.
UR - http://www.scopus.com/inward/record.url?scp=85113398196&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85113398196&partnerID=8YFLogxK
U2 - 10.1073/pnas.2102914118
DO - 10.1073/pnas.2102914118
M3 - Article
C2 - 34408075
AN - SCOPUS:85113398196
SN - 0027-8424
VL - 118
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 35
M1 - e2102914118
ER -