Hello,
GATK recommends the following filter expression for hard filtering of VCFs: "QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0"
The website does explain where the numbers in each condition come from. However, I'm still trying to understand why the recommended filter expression uses "OR" instead of "AND"; i.e., why is the expression that decides which variants are filtered out written as "QD < 2.0 OR FS > 60.0 OR MQ < 40.0 etc." instead of "QD < 2.0 AND FS > 60.0 AND MQ < 40.0 etc."?
Also, how important are DP and GQ when filtering for good variants? Is it necessary to do further filtering after what GATK recommends? Will this cause over-filtering?
I only have 5 samples, which is why I am not using VQSR for the filtering step. Thanks a lot for your help!
Thank you for the clarification! So with the OR expression, a variant is filtered out as soon as any one condition is true, which means the variants that pass the filter will have QD >= 2.0 and FS <= 60.0 and MQ >= 40.0 and MQRankSum >= -12.5 and ReadPosRankSum >= -8.0.
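A quick way to check this reasoning is a small Python sketch. It is only an illustration of the logic, not how GATK evaluates the expression internally: the variants are made-up dictionaries of annotation values (hypothetical numbers, not real data), and it assumes every annotation is present, which is not always true in a real VCF.

```python
# Thresholds are the ones from the filter expression quoted in the question.
# The variant dictionaries below are hypothetical examples, not real calls.

def fails_filter_or(v):
    """OR version: the variant is filtered out if ANY condition is true."""
    return (
        v["QD"] < 2.0
        or v["FS"] > 60.0
        or v["MQ"] < 40.0
        or v["MQRankSum"] < -12.5
        or v["ReadPosRankSum"] < -8.0
    )

def fails_filter_and(v):
    """AND version: the variant is filtered out only if ALL conditions are true."""
    return (
        v["QD"] < 2.0
        and v["FS"] > 60.0
        and v["MQ"] < 40.0
        and v["MQRankSum"] < -12.5
        and v["ReadPosRankSum"] < -8.0
    )

def passes_every_check(v):
    """Negation of the OR expression (De Morgan): every annotation is acceptable."""
    return (
        v["QD"] >= 2.0
        and v["FS"] <= 60.0
        and v["MQ"] >= 40.0
        and v["MQRankSum"] >= -12.5
        and v["ReadPosRankSum"] >= -8.0
    )

# A variant that looks fine on every annotation.
good = {"QD": 15.0, "FS": 3.0, "MQ": 60.0, "MQRankSum": 0.1, "ReadPosRankSum": 0.5}
# Same variant, but with strong strand bias (only FS is bad).
strand_biased = dict(good, FS=75.0)

for variant in (good, strand_biased):
    # "Does not fail the OR filter" is the same as "passes every check".
    assert passes_every_check(variant) == (not fails_filter_or(variant))

print(fails_filter_or(good))            # False
print(fails_filter_or(strand_biased))   # True
print(fails_filter_and(strand_biased))  # False
```

With AND, the strand-biased variant would slip through because only one of its annotations is bad, which is why the expression that flags variants for removal is joined with OR.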