10.1 years ago
tiffy
I have a set of phased (Beagle) SNPs for one chromosome in VCF format and would like to use selscan to detect selection. selscan requires a hap file and a map file as input, so I have tried vcftools --plink to generate a map file from the VCF and vcftools --IMPUTE to generate a hap file (although, as far as I can see, selscan seems to require this matrix to be transposed before it accepts it as input). Is there a better way to convert my VCF file?
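For illustration, the conversion can be sketched in a few lines of Python. This is a minimal sketch, not selscan's or vcftools' own method. It assumes that the hap file holds one haplotype per row with sites as 0/1 columns, and that the map file holds chromosome, locus ID, genetic position and physical position per site. The function name vcf_to_selscan is hypothetical, only phased biallelic records without missing genotypes are handled, and the physical position is written as a placeholder for the genetic position because no genetic map is available here.

```python
def vcf_to_selscan(vcf_lines):
    """Turn phased, biallelic VCF records into hap rows and map rows (sketch)."""
    site_columns = []  # one list of "0"/"1" alleles per site, in sample order
    map_rows = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip meta lines and the header line
        fields = line.split("\t")
        chrom, pos, snp_id, alt = fields[0], fields[1], fields[2], fields[4]
        if "," in alt:
            continue  # skip multiallelic sites
        alleles = []
        for sample in fields[9:]:
            genotype = sample.split(":")[0]  # GT is the first FORMAT field
            if "|" not in genotype:
                raise ValueError(f"unphased genotype at {chrom}:{pos}")
            alleles.extend(genotype.split("|"))
        if any(a not in ("0", "1") for a in alleles):
            raise ValueError(f"missing or non-biallelic call at {chrom}:{pos}")
        if snp_id == ".":
            snp_id = f"{chrom}_{pos}"  # assumption: build an ID when none is given
        # Assumption: physical position stands in for the genetic position.
        map_rows.append(f"{chrom} {snp_id} {pos} {pos}")
        site_columns.append(alleles)
    # Transpose: sites-by-haplotypes becomes haplotypes-by-sites.
    hap_rows = [" ".join(hap) for hap in zip(*site_columns)]
    return hap_rows, map_rows
```

Writing the two returned lists to files, one entry per line, gives a hap/map pair; the zip(*...) call performs the same transposition that the vcftools --IMPUTE output would otherwise need.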
Just to mention, selscan can also take .tped formatted files (in addition to .hap/.map), and I hope to have direct VCF support implemented soon.
-Zach
Basic VCF support has been added as of 06MAY2015.
Sorry for the plug, but we have recently developed a selection pipeline that works straight from VCF: https://github.com/smilefreak/selectionTools
It uses the R package rehh to perform the iHS calculation and annotates the ancestral and derived allele states, which are needed for iHS. You will, however, need to obtain an ancestral FASTA sequence.
Thank you for sharing this. Unfortunately, for most of the data I am working with, obtaining an ancestral FASTA sequence will be a problem, so I wanted to use selscan, as I read that it "is 'dumb' with respect ancestral/derived coding".
Could you align some related species and take a consensus of the alleles at positions that match up, using a program such as LAST (http://last.cbrc.jp/)? This was done for a paper on Arabidopsis thaliana, if I remember correctly.
But if you really cannot obtain one and you have more than one population in your data, the XP-EHH statistic is unaffected because it is a ratio, so it can be really informative.
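To illustrate why allele polarisation does not matter here, the sketch below computes haplotype homozygosity over a window with all chromosomes of a population pooled, then takes the log ratio between two populations. It is a simplified stand-in for XP-EHH, not the full statistic: the real calculation integrates EHH over distance and standardises the result, and the function names ehh, flip and log_ratio are hypothetical. Homozygosity depends only on which haplotypes are identical, so swapping the 0/1 labels at a site leaves it unchanged.

```python
from collections import Counter
from math import comb, log


def ehh(haps, start, end):
    """Haplotype homozygosity over sites start..end (inclusive), pooled sample."""
    counts = Counter(tuple(h[start:end + 1]) for h in haps)
    return sum(comb(c, 2) for c in counts.values()) / comb(len(haps), 2)


def flip(haps, site):
    """Swap the 0/1 coding at one site, as a change of ancestral state would."""
    return [h[:site] + [1 - h[site]] + h[site + 1:] for h in haps]


def log_ratio(pop_a, pop_b, start, end):
    """Log ratio of pooled homozygosity between two populations."""
    return log(ehh(pop_a, start, end) / ehh(pop_b, start, end))


pop_a = [[0, 0, 1], [0, 0, 1], [0, 0, 1], [1, 1, 0]]
pop_b = [[0, 1, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]

before = log_ratio(pop_a, pop_b, 0, 2)
after = log_ratio(flip(pop_a, 1), flip(pop_b, 1), 0, 2)
print(before == after)  # relabelling site 1 in both populations changes nothing
```

The same relabelling would change iHS, because iHS compares haplotypes carrying the ancestral allele against those carrying the derived allele at the core site.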