Hello all!
I am having trouble converting a .vcf file to a BayeScan input file that takes the population information into account. I have a dataset of 34 individuals divided into 10 populations and over 34K SNPs to analyze. I tried converting a .vcf file (generated with the STACKS program) to a BayeScan input file using PGDSpider. Unfortunately, while the conversion completes successfully, PGDSpider seems to ignore the population information I provided in the SPID file (i.e. a simple text file containing two columns, one with individual information and another with population information). The conversion generates a file that contains a single population only and, apparently, does not distinguish between individuals.
I assume I either have the wrong idea about how to make a population info file or am missing some necessary steps in the VCF -> BayeScan conversion. I have seen other posts here and elsewhere (I do not remember if it was on ResearchGate or Google Groups) suggesting a conversion via an intermediate format, like VCF -> PGD -> BayeScan, but it did not work for me.
Any thoughts? Help would be greatly appreciated!
Very best, Leonardo.
Not sure if this is a good solution, but it is one that works. Assuming you performed a de novo analysis, you can change the quality scores in your VCF from "." to some arbitrary value and then use the Python script available here. To replace the quality scores, you can do something along the lines of:
awk 'BEGIN { FS = OFS = "\t" } !/^#/ && $6 == "." { $6 = 20 } { print }' Old.vcf > New.vcf
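If you would rather stay in Python, the same QUAL substitution could be sketched roughly as below. This is only an illustration, not the script linked above: the function names `fill_missing_qual` and `convert_vcf` and the default value of 20 are my own assumptions. It also assumes a plain-text, tab-delimited VCF with QUAL in the sixth column.

```python
def fill_missing_qual(lines, default_qual="20"):
    """Yield VCF lines, replacing a missing QUAL value (".") with default_qual.

    Header lines (starting with "#") are passed through unchanged.
    The name and the default of 20 are illustrative assumptions.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # QUAL is the sixth tab-separated column of a VCF data line.
        if len(fields) > 5 and fields[5] == ".":
            fields[5] = default_qual
        yield "\t".join(fields) + "\n"


def convert_vcf(old_path, new_path, default_qual="20"):
    """Read old_path and write a copy with missing QUAL values filled in."""
    with open(old_path) as src, open(new_path, "w") as dst:
        dst.writelines(fill_missing_qual(src, default_qual))
```

Calling `convert_vcf("Old.vcf", "New.vcf")` should then give the same kind of output as the awk one-liner. Only the QUAL column is touched, so the genotype columns are left exactly as STACKS wrote them.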