I followed the hard-filtering method to filter variants in an Illumina whole-exome dataset. I have a number of variants with MQ 40. One of my colleagues suggested looking at the read support and read quality of these variants in IGV to decide whether to include or exclude them. However, I am not sure which parameters to look at. Can someone give suggestions?
Mapping quality is assigned by the aligner itself, and the values differ from one aligner to another. You can annotate the variants using VEP, and then keep variants with a depth greater than 10 and a minor allele frequency < 0.001 (0.1%). After that, you can, for instance, look for mutations that are 'probably_damaging' or 'possibly_damaging', or check for mutations in specific genes, etc.
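To make the depth and frequency filter concrete, here is a minimal sketch in plain Python. It assumes each variant carries `DP` and `AF` keys in its INFO string; those key names, the example records, and the helper names `parse_info` and `passes` are illustrative only (VEP and ANNOVAR store population frequencies under their own field names), and sites are assumed to be biallelic. The cutoffs are depth > 10 and frequency < 0.1% (0.001).

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'DP=25;AF=0.0004' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True  # flag fields carry no value
    return fields


def passes(info, min_depth=10, max_af=0.001):
    """Keep variants with depth > min_depth and allele frequency < max_af."""
    fields = parse_info(info)
    depth = int(fields.get("DP", 0))
    # Assumption: a variant with no AF annotation is treated as novel (AF = 0).
    af = float(fields.get("AF", 0.0))
    return depth > min_depth and af < max_af


# Hypothetical records: (chromosome, position, INFO string)
records = [
    ("chr1", 12345, "DP=25;AF=0.0004"),  # deep and rare
    ("chr1", 67890, "DP=8;AF=0.0002"),   # too shallow
    ("chr2", 13579, "DP=40;AF=0.05"),    # too common
]

kept = [r for r in records if passes(r[2])]
print(kept)
```

On a real call set you would read the records with a VCF parser and take the frequency from whichever annotation field your tool writes.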
Thanks. I annotated with the ANNOVAR tool, filtered for variants with a frequency below 0.1%, and then filtered for exonic and nonsynonymous variants. In IGV, I checked the unfiltered bamout and VCF files for quality, depth, GQ and mapping quality. The unfiltered files had a mapping quality of 60, but the hard-filtered VCF files had MQ less than 40. Is this expected? I also ran variant evaluation https://gatk.broadinstitute.org/hc/en-us/articles/360040507171-VariantEval-BETA- and got a sensitivity of 40%. The exome dataset was from 6 subjects, so would you expect a low sensitivity value?
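One thing that may help when comparing the two numbers: the per-read mapping quality shown in IGV (MAPQ) and the site-level MQ in the VCF are not the same quantity. GATK's MQ annotation is the root mean square of the mapping qualities of the reads covering the site, so a site can show many MAPQ 60 reads and still have a low MQ if it also has reads with MAPQ 0. A small sketch with made-up read counts (the function name and the numbers are only for illustration):

```python
import math


def rms_mapping_quality(mapqs):
    """Root mean square of the per-read mapping qualities at one site."""
    return math.sqrt(sum(q * q for q in mapqs) / len(mapqs))


# Hypothetical pileups: every read uniquely mapped vs. a mix with MAPQ 0 reads.
all_unique = [60] * 20
mixed = [60] * 8 + [0] * 12

print(rms_mapping_quality(all_unique))
print(rms_mapping_quality(mixed))
```

In the second pileup the reads you click on in IGV can still report MAPQ 60, while the site-level value drops below the usual hard-filter cutoff of 40.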