Dear Biostars,
To clarify, this question is about best practices with trio samples (proband, mother, father).
I have been able to remove low-confidence variants from .vcf.gz files via:
bcftools filter -Oz -i'FMT/DP>10' ${path}/x.vcf.gz > ${path}/x_filt.vcf.gz
This removes variants where the DP is 10 or less in all three samples.
However, it keeps variants where one sample has more than 10 reads and the others do not. Below are two lines from x_filt.vcf.gz:
FORMAT Proband Mother Father
GT:AD:DP:GQ:PL 0/0:5,0:5:15:0,15,209 0/1:9,12:21:99:438,0,342 0/0:2,0:2:6:0,6,84
GT:AD:DP:GQ:PL 0/0:6,0:6:15:0,15,225 0/1:8,5:13:99:186,0,300 1/1:0,2:2:6:90,6,0
The DP is <10 in the proband and father for both variants, but both variants pass the filter because the mother has DP >10.
How can I remove cases such as these? I would prefer to use BCFtools, but other suggestions are welcome.
Many Thanks, Krutik
Can you try filtering by
MIN(DP)>10
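For reference, here is a minimal sketch of how that expression could be dropped into the original command. It assumes a reasonably recent bcftools, where MIN() applied to a FORMAT tag evaluates to a single value across all samples, so a site is kept only when the lowest per-sample depth exceeds the threshold. The function name filter_trio_min_dp and its arguments are made up for illustration. The tag is written as FMT/DP because a bare DP may be read as INFO/DP when the file carries both. How sites with a missing DP in one sample are treated is worth checking on your own data.

```shell
# Hypothetical helper (not part of bcftools): keep only sites where the
# minimum FORMAT/DP across all samples is greater than the threshold.
# Arguments: input VCF, output VCF, depth threshold (defaults to 10).
filter_trio_min_dp() {
    in_vcf=$1
    out_vcf=$2
    min_dp=${3:-10}
    # Assumption: MIN() on a FORMAT tag is evaluated across samples,
    # so this is true only if every sample has DP above the threshold.
    bcftools filter -Oz -i "MIN(FMT/DP)>${min_dp}" -o "$out_vcf" "$in_vcf"
}

# Example call, using the paths from the question:
# filter_trio_min_dp "${path}/x.vcf.gz" "${path}/x_filt.vcf.gz" 10
```

With the two example lines from the question, the minimum DP is 2 in both cases, so both sites should be dropped under this expression.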