I'm trying to annotate variants with the Ensembl Variant Effect Predictor (VEP), and I'd prefer the annotation to be done against RefSeq transcripts. VEP does not seem to do this, even when I explicitly request it in the options.
For example, the variant
1 100672060 rs12021720 T C
clearly lies in the third exon (counting in genomic order) of this RefSeq transcript:
2 NM_001918 chr1 - 100652477 100715409 100661810 100715376 11 100652477,100671785,100672000,100676249,100680372,100681538,100684181,100696288,100700991,100706316,100715325, 100661978,100671857,100672192,100676327,100680539,100681755,100684303,100696470,100701067,100706440,100715409, 0 DBT cmpl cmpl 0,0,0,0,1,0,1,2,1,0,0,
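The overlap can be checked with a short script. This is only a sketch: it assumes the row above follows the UCSC refGene convention (exon starts 0-based, exon ends exclusive) and that the variant position is 1-based as in VCF. The helper names `parse_coords` and `find_exon` are made up for illustration.

```python
# Fields copied from the NM_001918 row quoted above.
STRAND = "-"
EXON_STARTS = ("100652477,100671785,100672000,100676249,100680372,100681538,"
               "100684181,100696288,100700991,100706316,100715325,")
EXON_ENDS = ("100661978,100671857,100672192,100676327,100680539,100681755,"
             "100684303,100696470,100701067,100706440,100715409,")


def parse_coords(field):
    """Turn a comma-separated coordinate list (with trailing comma) into ints."""
    return [int(x) for x in field.split(",") if x]


def find_exon(pos, starts, ends, strand):
    """Locate a 1-based position among the exon blocks.

    Returns (block number in genomic order, exon number in transcript order),
    or None if the position is not inside any block.
    Assumption: starts are 0-based and ends are exclusive (UCSC convention).
    """
    for i, (start, end) in enumerate(zip(starts, ends)):
        if start < pos <= end:
            genomic_index = i + 1
            # On the minus strand the transcript runs right to left,
            # so exon numbering is reversed relative to genomic order.
            number = genomic_index if strand == "+" else len(starts) - i
            return genomic_index, number
    return None


starts = parse_coords(EXON_STARTS)
ends = parse_coords(EXON_ENDS)
print(find_exon(100672060, starts, ends, STRAND))  # prints (3, 9)
```

Under those assumptions the variant falls in the third block in genomic order. Because the transcript is on the minus strand, that block would be exon 9 of 11 in transcript order.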
Yet VEP seems intent on giving me annotation only in terms of CCDS & Ensembl transcripts.
Uploaded Variation Location Allele Gene Feature Feature type Consequence Position in cDNA Position in CDS Position in protein Amino acid change Codon change Co-located Variation Extra
1_100672060_T/C 1:100672060 C CCDS767.1 CCDS767.1 Transcript missense_variant 1150 1150 384 S/G Agt/Ggt - -
1_100672060_T/C 1:100672060 C ENSESTG00000014700 ENSESTT00000036776 Transcript intron_variant - - - - - - -
1_100672060_T/C 1:100672060 C ENSESTG00000014716 ENSESTT00000036811 Transcript missense_variant 160 109 37 S/G Agt/Ggt - -
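A quick way to confirm which transcript sets appear in the output is to classify the Feature column by ID prefix. The sketch below hard-codes the three Feature IDs from the rows above; the function name `feature_source` is hypothetical, and the prefix rules are a simplification (RefSeq transcript accessions start with NM_, NR_, XM_ or XR_; Ensembl IDs start with ENS).

```python
from collections import Counter

# Feature IDs copied from the three VEP output rows above.
FEATURES = ["CCDS767.1", "ENSESTT00000036776", "ENSESTT00000036811"]


def feature_source(feature_id):
    """Guess the annotation source of a transcript ID from its prefix."""
    if feature_id.startswith(("NM_", "NR_", "XM_", "XR_")):
        return "RefSeq"
    if feature_id.startswith("CCDS"):
        return "CCDS"
    if feature_id.startswith("ENS"):
        return "Ensembl"
    return "other"


counts = Counter(feature_source(f) for f in FEATURES)
print(dict(counts))  # prints {'CCDS': 1, 'Ensembl': 2}
```

No RefSeq accession shows up among these rows, which matches the problem described.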
Any ideas on why this happens? I tried both the command line and online versions and I get the same result.
Command:
perl path_to_vep/vep/variant_effect_predictor.pl -i PathTo.vcf --offline --dir_cache Path_to_Vep/vep/ --fasta Path_to_Vep/vep/homo_sapiens/74/Homo_sapiens.GRCh37.74.dna.primary_assembly.fa -o vep_annotation.txt --hgvs --force_overwrite --sift b --polyphen b --maf_1kg --maf_esp --refseq --gmaf --protein --ccds
Can you paste the command here?
I added the command, but it probably won't help much, because VEP annotations depend on which version of the annotation database you download. I chose the RefSeq-based database during installation. The results are also reproducible with the online VEP tool.
I thought you might be missing the "--refseq" flag. BTW, I found this post, which may be relevant to your problem (http://lists.ensembl.org/pipermail/dev/2013-February/008400.html)
Thanks for the link. I came across that thread in my Google searches, which is how I ended up including the --refseq flag. It still doesn't fix my issue, even though I'm using the proper database. I decided to post here since I've seen someone from Ensembl respond to these questions on this forum, but I'll likely have to e-mail the mailing list to get a solution. VEP is a pretty comprehensive tool that gives me most of the annotation I need, apart from this issue.
If nobody replies, you can try contacting Emily (Emily_Ensembl) about this. She should be able to figure out what is going wrong.
Clearly I have a reputation here. I'm going to pass this on to Will, who made and maintains the VEP, to see what he says.