Hello everyone,
I have a VCF file containing nearly 11 million SNPs. I want to convert it into a 012 genotype matrix for LD pruning. I am using this command:
/data/programs/vcftools_0.1.13/bin/vcftools --vcf my.file.vcf --012 --out output_geno.vcf
I get the output, but I am confused. According to the manual, the rows of the 012 genotype matrix are individuals and the columns are genotypes. With 11 million SNPs, shouldn't I get 11 million columns (one column per SNP)? When I count the columns, there are only about one million! Is something wrong, or am I making a silly mistake?
Thanks for any help figuring out my mistake.
Did you check the *.indv and *.pos files that are also output with the --012 parameter? The *.indv file should contain the expected number of samples that were in the input VCF, and the *.pos file lists the sites that were written, one per line.
Also, check the log file that's produced, particularly the line:
Kevin
How did you count the columns?
If you do something like this:
Do you have the right number of columns?
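For anyone following along, here is a minimal sketch of one way to do the count. It assumes the matrix is tab-separated with the individual's index in the first field (so the SNP count is the field count minus one) and that the *.pos file has one line per site. The function names and the tiny demo files are made up for illustration; they are not part of vcftools.

```shell
# Hypothetical helpers for checking a vcftools --012 result.
count_matrix_sites() {
    # Assumption: the first tab-separated field is the individual's index,
    # so the number of SNP columns is NF - 1 (read from the first row only).
    awk -F'\t' 'NR == 1 { print NF - 1 }' "$1"
}

count_pos_sites() {
    # Assumption: the .pos file holds one site (chromosome, position) per line.
    awk 'END { print NR }' "$1"
}

# Made-up demo data: 2 individuals, 3 sites (-1 marks a missing genotype).
tmpdir=$(mktemp -d)
printf '0\t0\t1\t2\n1\t2\t-1\t0\n' > "$tmpdir/demo.012"
printf '1\t100\n1\t250\n2\t75\n' > "$tmpdir/demo.012.pos"

matrix_sites=$(count_matrix_sites "$tmpdir/demo.012")
pos_sites=$(count_pos_sites "$tmpdir/demo.012.pos")
echo "matrix columns: $matrix_sites, .pos lines: $pos_sites"
```

With the --out prefix from the question, the real files would presumably be output_geno.vcf.012 and output_geno.vcf.012.pos. If the two numbers disagree, the counting method is the more likely culprit than vcftools itself.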