I used GATK to call variants on an exome-seq data set with 10 samples. I copied the per-sample fields of one line (one identified SNP) from the VCF output, as shown below. Each field starts with the genotype (for example, 0/1 for heterozygous), and the reference and variant read counts follow immediately after it (for example, 2,10 in the first sample).
My question is: how do you filter the VCF by coverage (ref reads + variant reads)? And what should I do if some of the samples pass the filter while others fail?
0/1:2,10:12:28:256,0,28 0/1:13,18:31:99:427,0,315 0/1:6,9:15:99:246,0,155 0/1:8,8:16:99:176,0,187 0/0:8,0:8:24:0,24,259 0/1:5,6:11:99:144,0,136 0/1:5,5:10:99:103,0,110 0/1:4,6:10:99:161,0,103 0/0:10,0:10:30:0,30,277 0/1:16,7:23:99:192,0,529
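One way to approach the per-sample coverage question is sketched below in plain Python. It assumes the FORMAT column for this line is GT:AD:DP:GQ:PL (the FORMAT column is not shown in the post, so this is a guess based on the values). The function names and the 10-read threshold are hypothetical. Samples below the threshold have their genotype set to missing (./.) rather than the whole site being dropped, which is one common convention and not the only option.

```python
# Per-sample fields copied from the question.
# ASSUMPTION: FORMAT is GT:AD:DP:GQ:PL (not shown in the original post).
SAMPLES = (
    "0/1:2,10:12:28:256,0,28 0/1:13,18:31:99:427,0,315 "
    "0/1:6,9:15:99:246,0,155 0/1:8,8:16:99:176,0,187 "
    "0/0:8,0:8:24:0,24,259 0/1:5,6:11:99:144,0,136 "
    "0/1:5,5:10:99:103,0,110 0/1:4,6:10:99:161,0,103 "
    "0/0:10,0:10:30:0,30,277 0/1:16,7:23:99:192,0,529"
).split()


def allele_depth(sample_field):
    """Return ref reads + variant reads, taken from the AD sub-field."""
    parts = sample_field.split(":")
    if len(parts) < 2:
        return 0  # no AD available (e.g. a bare "./." call)
    return sum(int(x) for x in parts[1].split(",") if x != ".")


def mask_low_coverage(sample_fields, min_depth):
    """Set the genotype to ./. for every sample below min_depth."""
    masked = []
    for field in sample_fields:
        if allele_depth(field) < min_depth:
            parts = field.split(":")
            parts[0] = "./."
            field = ":".join(parts)
        masked.append(field)
    return masked


# Hypothetical threshold of 10 reads per sample.
masked = mask_low_coverage(SAMPLES, min_depth=10)
n_pass = sum(1 for f in masked if not f.startswith("./."))
print(n_pass, "of", len(masked), "samples pass")
```

After masking, the site itself can be kept or dropped depending on how many samples still carry a genotype, for example by requiring that a chosen fraction of samples pass.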
Hi Pierre, could you please tell me how I can filter my multi-sample VCF file (250 samples) to keep only variants with at least five homozygous REF and five homozygous ALT calls using VCFFilterJS? Thank you!
This is a new question; please post it as one: https://www.biostars.org/p/new/post/
Could you please tell me how I can filter my multi-sample VCF file (250 samples) to keep only variants with at least five homozygous REF and five homozygous ALT calls using VCFFilterJS?
Again, I'll give you the answer, but please ask this as a different question ( https://www.biostars.org/p/new/post/ ); it is not "How To Filter Vcf By Coverage?". Ask your question as a NEW question so everybody can contribute and follow the new thread. Thanks.
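For the homozygous-count question above, the selection logic can be sketched in plain Python. This is not VCFFilterJS, whose script interface is not shown in this thread. The function names are hypothetical, and the sketch assumes diploid calls with the genotype as the first sub-field of each sample column.

```python
def count_homozygous(sample_fields):
    """Count homozygous REF and homozygous ALT calls among sample fields.

    ASSUMPTION: GT is the first colon-separated sub-field and calls are diploid.
    """
    hom_ref = hom_alt = 0
    for field in sample_fields:
        gt = field.split(":")[0].replace("|", "/")  # treat phased like unphased
        alleles = gt.split("/")
        # Skip missing, non-diploid, and heterozygous calls.
        if len(alleles) != 2 or "." in alleles or alleles[0] != alleles[1]:
            continue
        if alleles[0] == "0":
            hom_ref += 1
        else:
            hom_alt += 1
    return hom_ref, hom_alt


def keep_variant(sample_fields, min_hom_ref=5, min_hom_alt=5):
    """True if the variant has enough homozygous REF and homozygous ALT calls."""
    hom_ref, hom_alt = count_homozygous(sample_fields)
    return hom_ref >= min_hom_ref and hom_alt >= min_hom_alt


# Toy example with made-up genotypes (not real data).
toy = ["0/0"] * 5 + ["1/1"] * 5 + ["0/1"] * 3 + ["./."]
print(count_homozygous(toy), keep_variant(toy))
```

The same test would be applied to the sample columns (column 10 onward) of each VCF data line, with header lines passed through unchanged.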