How to get the number of distinct variants genotyped from VCF or bed/bim files (GWAS)?
2.1 years ago
Qingyang Xiao ▴ 160

Hi,

How can I get the number of distinct variants genotyped from VCF or bed/bim files?

I want to do this to compare the number of genotyped variants across different batches that were run on the same array.

Thanks!

2.1 years ago
ffredew • 0

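bcftools has great tools for manipulating VCF files.

In bash, you can simply try:

grep -v "^#" vcf_file.vcf | cut -f3 | sort | uniq | wc -l 

If the VCF file is gzipped:

zcat vcf_file.vcf.gz | grep -v "^#" | cut -f3 | sort | uniq | wc -l 

Note that this counts distinct values in the ID column (column 3), so all variants with a missing ID (".") are counted as one.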
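
Because the one-liners above key on the ID column, they can undercount when IDs are missing or differ between batches. Counting by position and alleles may be more robust. A minimal sketch, assuming a standard tab-delimited VCF where columns 1, 2, 4 and 5 are CHROM, POS, REF and ALT; `count_distinct_vcf` is a made-up helper name, not part of bcftools or any other tool:

```shell
# Hypothetical helper: count distinct CHROM/POS/REF/ALT combinations in a VCF.
# Accepts plain-text or gzipped input, chosen by file extension (an assumption).
count_distinct_vcf() {
  local vcf="$1"
  case "$vcf" in
    *.gz) gzip -cd "$vcf" ;;   # decompress to stdout
    *)    cat "$vcf" ;;        # plain text, pass through
  esac \
    | grep -v '^#' \
    | cut -f1,2,4,5 \
    | sort -u \
    | wc -l \
    | tr -d ' '                # some wc implementations pad the count with spaces
}
```

Running it on each batch, e.g. `count_distinct_vcf batch1.vcf.gz` (file name is only an illustration), prints one integer per file that can be compared directly.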
2.1 years ago

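If the file is indexed:

bcftools index -s indexed.vcf.gz | cut -f3 | paste -sd '+' | bc

If the file isn't indexed:

bcftools query -f '.' file.vcf.gz | wc -c

Both commands count records, so a variant that appears in more than one record is counted more than once.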
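
The question also asks about bed/bim files, which the commands above don't cover. In a PLINK fileset the variants are listed in the .bim file, one per line, so the same idea applies there. A minimal sketch, assuming the standard six-column .bim layout (chromosome, variant ID, cM position, bp position, allele 1, allele 2); `count_distinct_bim` is a made-up helper name:

```shell
# Hypothetical helper: count distinct chromosome/position/allele combinations
# in a PLINK .bim file (whitespace-delimited, six columns assumed).
count_distinct_bim() {
  local bim="$1"
  # $1 = chromosome, $4 = bp position, $5 and $6 = the two alleles
  awk '{ print $1 ":" $4 ":" $5 ":" $6 }' "$bim" \
    | sort -u \
    | wc -l \
    | tr -d ' '                # strip padding some wc implementations add
}
```

If the goal is simply the number of lines (one per variant) rather than distinct ones, `wc -l < file.bim` is enough. Note that this sketch treats swapped allele order as a different variant, which may or may not be what you want when comparing batches.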