This is my first time using the Variant Effect Predictor (VEP), and I would like to use it to annotate VCF files from WES data. I tried to set up filters to keep only mutations with a mutant allele frequency higher than 0.2 (number of mutant reads / total read count > 0.2).
This is the command I used:
./vep --cache --offline --symbol --coding_only \
--freq_freq 0.2 --freq_gt_lt gt --freq_filter include \
-i input.vcf -o output.txt
I checked the results by loading the BAM files in IGV. However, almost all the mutations I have checked so far have an allele frequency < 0.2. For example:
Total counts: 118
A: 0
C: 0
G: 102 (86%, 86+, 16-)
T: 16 (14%, 16+, 0-)
N: 0
The G -> T mutation has a frequency of only 0.14 (16/118).
Does anyone have experience with VEP? I may be using it incorrectly; could you point out what I am missing? Thank you.
Here are a few lines of the VCF:
1 69428 . T G . . . AD:DP:n.read.pos:n.read.pos.ref:raw.count:raw.count.ref:raw.count.total:mean.quality:count.plus:count.plus.ref:count.minus:count.minus.ref:read.pos.mean:read.pos.var:codon.dir 0,2:2:2:0:2:0:2:35.5:2:0:0:0:34.5:12.5:0
1 69511 . A G . . . AD:DP:n.read.pos:n.read.pos.ref:raw.count:raw.count.ref:raw.count.total:mean.quality:count.plus:count.plus.ref:count.minus:count.minus.ref:read.pos.mean:read.pos.var:codon.dir 0,2:2:2:0:2:0:2:37.5:0:0:2:0:40:2:0
1 183629 . G A . . . AD:DP:n.read.pos:n.read.pos.ref:raw.count:raw.count.ref:raw.count.total:mean.quality:mean.quality.ref:count.plus:count.plus.ref:count.minus:count.minus.ref:read.pos.mean:read.pos.mean.ref:read.pos.var:read.pos.var.ref:codon.dir 14,6:20:6:13:6:14:20:37.5:36.8571:6:13:0:1:32.1667:28.6429:527.506:431.971:0
I want to filter on the mutant allele frequency in my own data (in-house frequency: the number of reads carrying the mutation divided by the total number of reads in the BAM file), not on the allele frequencies from the 1000 Genomes data. Can VEP do this?
VEP assumes a standard VCF when filtering on standard fields such as AF. Unless the source file provides AF in the standard format, the filter won't work.
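One possible workaround is to filter the VCF on the in-house fraction yourself and then pass the filtered file to VEP. Below is a minimal Python sketch of that idea, not a VEP feature. It assumes the file is tab-separated with a single sample column, that the AD field holds "ref,alt" read counts as in the lines posted above, and that only the first ALT allele matters. The function names and the 0.2 default are my own choices for illustration.

```python
def alt_allele_fraction(format_field, sample_field):
    """Return ALT reads / (REF + ALT reads) from the AD entry, or None if unusable."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    if "AD" not in keys:
        return None
    idx = keys.index("AD")
    if idx >= len(values):
        return None
    try:
        depths = [int(x) for x in values[idx].split(",")]
    except ValueError:
        return None  # e.g. "." for missing counts
    total = sum(depths)
    if len(depths) < 2 or total == 0:
        return None
    return depths[1] / total  # assumption: only the first ALT allele is considered


def filter_vcf_lines(lines, min_fraction=0.2):
    """Yield header lines unchanged, plus records whose ALT fraction exceeds min_fraction."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns, so nothing to compute
        fraction = alt_allele_fraction(fields[8], fields[9])
        if fraction is not None and fraction > min_fraction:
            yield line
```

To use it, one could open input.vcf, write the lines yielded by filter_vcf_lines to a new file, and give that file to VEP with -i. Note that a site such as 1:69428 above (AD 0,2) passes with a fraction of 1.0 on just two reads, so you may also want a minimum depth cutoff.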