Hello,
I am fairly new to bioinformatics and am stuck on the following. I have been trying to prune SNPs based on LD from a large VCF file, mainly following this really nice tutorial on how to do it in PLINK: https://evomics.org/learning/population-and-speciation-genomics/2016-population-and-speciation-genomics/fileformats-vcftools-plink/ However, when I convert my PLINK binary files back to VCF format, the INFO and QUAL data are lost from the VCF file. I also looked into using VCFtools for this:
vcftools --vcf <original vcf file> --snps snps_ld_0.8.prune.in --recode --recode-INFO-all --out <new vcf>
But I end up with a blank VCF file, or one containing only the header metadata. The tutorial (linked above) has me create a chromosome map that generates IDs for all the SNPs. Do I need to add these IDs to my original VCF file (which only has '.' in the ID column)? If so, can anyone recommend how I would go about this? Thanks for your help.
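For reference, this is a minimal sketch of what I imagine adding the IDs would look like. It assumes an uncompressed, tab-delimited VCF, and it assumes the IDs listed in snps_ld_0.8.prune.in have the form CHROM:POS, which I have not verified. The function name fill_vcf_ids and the file names in the usage comment are just placeholders I made up.

```shell
# Hypothetical helper: replace missing ('.') ID fields with CHROM:POS.
# Reads a plain-text VCF on stdin and writes the result to stdout.
# ASSUMPTION: CHROM:POS is the ID format used in the prune.in file.
fill_vcf_ids() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ { print; next }            # header lines pass through unchanged
         $3 == "." { $3 = $1 ":" $2 }    # only overwrite IDs that are missing
         { print }'
}

# Usage (placeholder file names):
# fill_vcf_ids < original.vcf > original.with_ids.vcf
```

If the IDs in the prune.in file use a different format, the `$1 ":" $2` expression would need to change to match, and sites that share a position would end up with duplicate IDs.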