TY - JOUR
T1 - Selscan
T2 - An efficient multithreaded program to perform EHH-based scans for positive selection
AU - Szpiech, Zachary A.
AU - Hernandez, Ryan D.
N1 - Publisher Copyright: © 2014 The Author 2014.
PY - 2014/10/1
Y1 - 2014/10/1
N2 - Haplotype-based scans to detect natural selection are useful to identify recent or ongoing positive selection in genomes. As both real and simulated genomic data sets grow larger, spanning thousands of samples and millions of markers, there is a need for a fast and efficient implementation of these scans for general use. Here, we present selscan, an efficient multithreaded application that implements Extended Haplotype Homozygosity (EHH), Integrated Haplotype Score (iHS), and Cross-population EHH (XPEHH). selscan accepts phased genotypes in multiple formats, including TPED, performs extremely well on both simulated and real data, and is over an order of magnitude faster than existing implementations. It calculates iHS on chromosome 22 (22,147 loci) across 204 CEU haplotypes in 353 s on one thread (33 s on 16 threads) and calculates XPEHH for the same data relative to 210 YRI haplotypes in 578 s on one thread (52 s on 16 threads). Source code and binaries (Windows, OSX, and Linux) are available at https://github.com/szpiech/selscan.
AB - Haplotype-based scans to detect natural selection are useful to identify recent or ongoing positive selection in genomes. As both real and simulated genomic data sets grow larger, spanning thousands of samples and millions of markers, there is a need for a fast and efficient implementation of these scans for general use. Here, we present selscan, an efficient multithreaded application that implements Extended Haplotype Homozygosity (EHH), Integrated Haplotype Score (iHS), and Cross-population EHH (XPEHH). selscan accepts phased genotypes in multiple formats, including TPED, performs extremely well on both simulated and real data, and is over an order of magnitude faster than existing implementations. It calculates iHS on chromosome 22 (22,147 loci) across 204 CEU haplotypes in 353 s on one thread (33 s on 16 threads) and calculates XPEHH for the same data relative to 210 YRI haplotypes in 578 s on one thread (52 s on 16 threads). Source code and binaries (Windows, OSX, and Linux) are available at https://github.com/szpiech/selscan.
UR - http://www.scopus.com/inward/record.url?scp=84921569624&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84921569624&partnerID=8YFLogxK
U2 - 10.1093/molbev/msu211
DO - 10.1093/molbev/msu211
M3 - Article
C2 - 25015648
AN - SCOPUS:84921569624
SN - 0737-4038
VL - 31
SP - 2824
EP - 2827
JO - Molecular Biology and Evolution
JF - Molecular Biology and Evolution
IS - 10
ER -