I have some VCF files generated with GATK Unified Genotyper (--output_mode EMIT_ALL_SITES). They look like this:
chr15 21138918 . C . 46.23 . . GT:DP 0/0:6
chr15 21138919 . C . 46.23 . . GT:DP 0/0:6
chr15 21138920 . T . 46.23 . . GT:DP 0/0:6
chr15 21138921 . C . 46.23 . . GT:DP 0/0:6
chr15 21138922 . A . 46.23 . . GT:DP 0/0:6
chr15 21138923 . A . 46.23 . . GT:DP 0/0:6
chr15 21138924 rs116806596 C T 173.84 . . GT:AD:DP:GQ:PL 1/1:0,6:6:18:202,18,0
chr15 21138925 . T . 43.23 . . GT:DP 0/0:6
chr15 21138926 . G . 46.23 . . GT:DP 0/0:6
chr15 21170494 . T . 34.23 . . GT:DP 0/0:5
chr15 21170495 . G . 34.23 . . GT:DP 0/0:5
chr15 21170496 . A . 34.23 . . GT:DP 0/0:5
My question is: can bcftools or vcftools filter this VCF file so that only the SNPs are shown (e.g., chr15 21138924 rs116806596 C T 173.84 . . GT:AD:DP:GQ:PL 1/1:0,6:6:18:202,18,0)?
I know that I could rerun GATK Unified Genotyper with EMIT_VARIANTS_ONLY, but I have so many files that this would be very time-consuming.
Thanks.
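For illustration, the filtering asked for here comes down to dropping every record whose ALT column is `.` and keeping the ones where REF and every ALT allele are single bases. Below is a minimal Python sketch of that idea, assuming uncompressed VCF text with whitespace-separated columns. The function names `is_snp_record` and `keep_snp_records` are hypothetical helpers made up for this example and are not part of bcftools, vcftools, or GATK.

```python
BASES = {"A", "C", "G", "T"}


def is_snp_record(line):
    """Return True if a VCF data line describes a SNP.

    Assumption: columns are whitespace-separated, with REF in
    column 4 and ALT in column 5, as in the records shown above.
    """
    fields = line.split()
    if len(fields) < 5:
        return False  # blank or truncated line
    ref = fields[3].upper()
    alt = fields[4].upper()
    if ref not in BASES:
        return False  # REF is not a single base, so not a SNP
    # An ALT of "." (non-variant site) is not in BASES and is rejected here.
    # Multi-allelic sites list their ALT alleles separated by commas.
    return all(allele in BASES for allele in alt.split(","))


def keep_snp_records(lines):
    """Yield header lines (starting with '#') and SNP records only."""
    for line in lines:
        if line.startswith("#") or is_snp_record(line):
            yield line


# Small demonstration on records taken from the question.
sample = [
    "chr15 21138923 . A . 46.23 . . GT:DP 0/0:6\n",
    "chr15 21138924 rs116806596 C T 173.84 . . GT:AD:DP:GQ:PL 1/1:0,6:6:18:202,18,0\n",
    "chr15 21138925 . T . 43.23 . . GT:DP 0/0:6\n",
]
for record in keep_snp_records(sample):
    print(record, end="")
```

To run this over a real file, the same generator could be fed an open file handle instead of the `sample` list. If a recent bcftools is installed, `bcftools view -v snps` may be the simpler route, since it selects records by variant type directly.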
Will work with vcf too. Last lines from output:
It works! Thanks a lot.