I want to make a separate .vcf.gz for every individual in the 1000 Genomes data. I've downloaded the .vcf.gz files for all the chromosomes and merged them into one big .vcf.gz. Now I want to create a separate .vcf for each individual.
I normally use GATK SelectVariants to do that. However, SelectVariants also requires a human reference genome, and I think that is where I went wrong, since all the single-sample .vcf.gz files come out empty (except for the header). Is there another option besides GATK? I have used VCFtools for other purposes before, but I noticed that it sometimes gets the allele frequency wrong when it splits a multi-sample file. Alternatively, which reference genome should I use to make GATK SelectVariants work with the 1000 Genomes data?
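One alternative that does not ask for a reference FASTA is bcftools. Below is a minimal sketch, assuming bcftools is installed and on the PATH, of how the per-sample split could be driven from Python. The function names, the merged file name, and the output layout are hypothetical and chosen only for illustration; the bcftools calls used are `bcftools query -l` (list sample names) and `bcftools view -s` (subset to one sample).

```python
import subprocess
from typing import List


def list_samples_cmd(vcf_path: str) -> List[str]:
    """Command that prints one sample name per line from the VCF header."""
    return ["bcftools", "query", "-l", vcf_path]


def split_sample_cmd(vcf_path: str, sample: str, out_dir: str = ".") -> List[str]:
    """Command that writes a bgzip-compressed single-sample VCF."""
    out_path = f"{out_dir}/{sample}.vcf.gz"  # hypothetical naming scheme
    return [
        "bcftools", "view",
        "-s", sample,    # keep only this sample's genotype column
        "-O", "z",       # compressed VCF output
        "-o", out_path,
        vcf_path,
    ]


def split_all(vcf_path: str, out_dir: str = ".") -> None:
    """Run one bcftools job per sample found in the merged file."""
    listing = subprocess.run(
        list_samples_cmd(vcf_path),
        check=True, capture_output=True, text=True,
    )
    for sample in listing.stdout.split():
        subprocess.run(split_sample_cmd(vcf_path, sample, out_dir), check=True)
```

Calling `split_all("merged.vcf.gz", "per_sample")` would then run one job per individual (the output directory has to exist already). This re-reads the whole merged file once per sample, so for the full panel it may be worth running it per chromosome instead.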
Sure, all the INFO column values that are computed across all samples (such as the allele frequency) should be recalculated in order to be informative.
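To make concrete what "recalculated" means here, this is a rough sketch of recomputing the allele count, allele number, and allele frequency from the genotypes that remain after subsetting. It assumes GT strings in the usual VCF form (`0|1`, `1/1`, `./.`); the function name is hypothetical and is not part of any tool mentioned above.

```python
import re
from typing import List, Tuple


def recompute_allele_stats(
    genotypes: List[str], n_alt: int
) -> Tuple[List[int], int, List[float]]:
    """Return (AC, AN, AF) computed only from the given GT strings."""
    ac = [0] * n_alt  # one count per ALT allele
    an = 0            # number of called alleles
    for gt in genotypes:
        for allele in re.split(r"[|/]", gt):
            if allele == ".":
                continue  # missing call, not counted in AN
            an += 1
            index = int(allele)
            if index > 0:
                ac[index - 1] += 1
    af = [count / an if an else 0.0 for count in ac]
    return ac, an, af
```

For a single heterozygous individual, `recompute_allele_stats(["0|1"], 1)` gives `([1], 2, [0.5])`, whatever the frequency was in the full panel. That is why a value copied over unchanged from the multi-sample file looks wrong in the single-sample one.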
By the way, this should be a comment, not an answer.