Dear all,
I've been using snpEff and snpSift for a while to annotate and filter my VCF files. Usually I just annotate my VCF with snpEff but, by default, snpEff annotates all transcripts. Recently I wanted to restrict my annotations (ANN) to a list of ~150 transcripts, but I could not find a way to do it. Is it possible? As a workaround, I re-annotated the VCF with snpEff using the "-onlyTr" option, but it is annoying to have to annotate a VCF twice. Is there a way to just filter the existing annotations?
However, even with the "-onlyTr" option, I still get some annotations that should have been removed. In the example below, the variant is annotated on both the MSH6 and FBXO11 genes, even though I only specified the transcript ENST00000234420. I guess it is because the gene FBXO11 has no transcript, but if I am only interested in that list of transcripts, this gene should not be included, right?
thanks
2 48033890 . CT C 1846.7 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=1.216;ClippingRankSum=0.000;DP=498;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;POSITIVE_TRAIN_SITE;QD=4.44;ReadPosRankSum=-0.195;SOR=0.711;VQSLOD=4.16;culprit=SOR;ANN=C|intron_variant|MODIFIER|MSH6|ENSG00000116062|transcript|ENST00000234420|protein_coding|9/9|c.4002-10delT||||||INFO_REALIGN_3_PRIME,C|intragenic_variant|MODIFIER|FBXO11|ENSG00000138081|gene_variant|ENSG00000138081|||n.48033891delA|||||| GT:AD:DP:GQ:PL 0/1:254,162:416:99:1884,0,3658
Just a thought: maybe a custom GFF3 containing only the genes/transcripts needed could be created and imported into snpEff as a custom annotation database. The annotation could then be run against that database, which would probably save time, too.
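On the original question of filtering annotations that are already in the VCF: one possible approach is a small post-processing script that drops every ANN entry whose Feature_ID is not in the transcript list. Below is a minimal Python sketch, not a snpEff or snpSift feature. The function names (filter_ann, filter_vcf) are made up for illustration, and it assumes the standard ANN layout where Feature_ID is the 7th "|"-separated field, that ANN entries are separated by commas, and that version suffixes such as ".5" should be ignored when comparing IDs.

```python
FEATURE_ID = 6  # 0-based index of Feature_ID in a "|"-separated ANN entry (assumed standard ANN layout)


def filter_ann(info, keep):
    """Return the INFO string with ANN restricted to feature IDs in `keep`.

    `keep` is a set of transcript IDs without version suffix (hypothetical input).
    If no ANN entry survives, the ANN key is removed from INFO.
    """
    out = []
    for field in info.split(";"):
        if not field.startswith("ANN="):
            out.append(field)
            continue
        kept = []
        for entry in field[len("ANN="):].split(","):
            parts = entry.split("|")
            # Gene-level entries (e.g. intragenic_variant) carry a gene ID here,
            # so they are dropped unless that ID is explicitly listed in `keep`.
            if len(parts) > FEATURE_ID and parts[FEATURE_ID].split(".")[0] in keep:
                kept.append(entry)
        if kept:
            out.append("ANN=" + ",".join(kept))
    return ";".join(out) if out else "."


def filter_vcf(lines, keep):
    """Yield VCF lines with the INFO column (8th, tab-separated) filtered."""
    for line in lines:
        if line.startswith("#"):
            yield line  # header lines pass through unchanged
            continue
        cols = line.rstrip("\n").split("\t")
        cols[7] = filter_ann(cols[7], keep)
        yield "\t".join(cols) + "\n"
```

With keep = {"ENST00000234420"}, the record in the question would retain the MSH6 entry and lose the FBXO11 intragenic_variant entry, because the latter's Feature_ID is the gene ID ENSG00000138081. To run it over a file, read the transcript list into a set and pass an open file handle to filter_vcf, writing the yielded lines to the output.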