VEP tab/vcf - Different output
2.7 years ago
Filago ▴ 100

Hello,

I am using VEP to annotate a VCF file and generating both tab and VCF output. However, the tab output contains mutations that are not present in the VCF output (or in the input VCF file). These mutations show no annotated reference allele, just a "-", e.g.:

1_899937_-/TCCGCA

Any ideas about how those mutations were created?

Many thanks in advance!

VEP ensembl • 3.1k views

I have a similar question along those lines. I have two consecutive variants within the same codon. Do you know how to get VEP to translate the codon with the variants recognised as a single delins, rather than as two separate SNVs (which are currently treated as independent of each other)?

e.g. A>C A>T as delinsCT in the same codon as opposed to A>C and A>T individually

TIA


Open a new question. Don't add answers unless you're answering the top level question.

2.7 years ago
finster ▴ 90

Hi,

I think you are running into the different ways that Ensembl and VCF files represent indels. Take a look at the VCF section of the input formats page in the VEP documentation. Specifically:

Users using VCF should note a peculiarity in the difference between how Ensembl and VCF describe unbalanced variants. For any unbalanced variant (i.e. insertion, deletion or unbalanced substitution), the VCF specification requires that the base immediately before the variant should be included in both the reference and variant alleles. This also affects the reported position i.e. the reported position will be one base before the actual site of the variant.

In order to parse this correctly, VEP needs to convert such variants into Ensembl-type coordinates, and it does this by removing the additional base and adjusting the coordinates accordingly. This means that if an identifier is not supplied for a variant (in the 3rd column of the VCF), then the identifier constructed and the position reported in VEP's output file will differ from the input.

If this is not the case, a small example file would be useful.
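To illustrate the conversion described in the documentation excerpt, here is a minimal Python sketch of the idea, not VEP's actual code. The function names are made up for illustration, and the VCF record used in the example (position 899936, reference base C) is an assumption chosen so that the result matches the identifier from the question.

```python
def vcf_to_ensembl(pos, ref, alts):
    """Convert VCF-style alleles to Ensembl-style alleles.

    Hypothetical helper: mimics the rule quoted from the VEP docs, i.e. for
    an unbalanced variant the shared leading base is removed and the
    position is shifted by one. An allele left empty is written as "-".
    """
    alleles = [ref] + list(alts)
    unbalanced = any(len(a) != len(ref) for a in alts)
    shared_first_base = len({a[0] for a in alleles}) == 1
    if unbalanced and shared_first_base:
        alleles = [a[1:] or "-" for a in alleles]
        pos += 1
    return pos, alleles[0], alleles[1:]


def default_identifier(chrom, pos, ref, alts):
    """Build a chrom_pos_ref/alt identifier from the converted alleles."""
    start, e_ref, e_alts = vcf_to_ensembl(pos, ref, alts)
    return f"{chrom}_{start}_{e_ref}/{'/'.join(e_alts)}"


# Assumed VCF record: chromosome 1, POS 899936, REF C, ALT CTCCGCA
print(default_identifier("1", 899936, "C", ["CTCCGCA"]))
```

With that assumed input, the insertion is reported one base downstream with "-" as the reference allele, which is why it looks like a variant that is absent from the input file.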


Many thanks for your quick and helpful answer!


A small educational note: if an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they work. This will help future users that might find this post find the right answer.



Is there any way to report the VCF-style ref and alt in the VEP tabular output?


I don't think so, no. Ensembl seems to have come up with a verbose explanation for why they don't want to put in the effort.


You can use the --show_ref_allele flag when running the VEP query. This will add the reference allele in the output and is mainly useful for the VEP "default" and tab-delimited output formats: https://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#output


I don't think that is in VCF-style coordinates which show deletions like chr1:99:AG -> A instead of chr1:100:G -> -


Ah, I see. Then, no, I don't believe that this is possible with VEP.
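As a possible workaround outside VEP, the VCF-style alleles could be rebuilt in post-processing by putting the preceding reference base back. The sketch below is only an illustration under stated assumptions: `ensembl_to_vcf` and `fetch_base` are hypothetical names, and the toy dictionary stands in for a real reference genome lookup. It produces a valid VCF-style representation, which is not guaranteed to be identical to the original input record.

```python
def ensembl_to_vcf(pos, ref, alt, fetch_base):
    """Rebuild VCF-style (pos, ref, alt) from Ensembl-style alleles.

    fetch_base is a hypothetical callable returning the reference base at
    a 1-based position; in practice it would query a reference genome.
    """
    ref_seq = "" if ref == "-" else ref
    alt_seq = "" if alt == "-" else alt
    if len(ref_seq) == len(alt_seq):
        # Balanced variant (e.g. an SNV): the two styles agree.
        return pos, ref, alt
    # Unbalanced variant: prepend the base before the variant and
    # report the position of that base.
    anchor = fetch_base(pos - 1)
    return pos - 1, anchor + ref_seq, anchor + alt_seq


# Toy reference for illustration only: base "A" at position 99
toy_reference = {99: "A"}
print(ensembl_to_vcf(100, "G", "-", toy_reference.get))
```

For the deletion mentioned above (chr1:100:G -> -), this gives position 99 with alleles AG and A, assuming the base at position 99 is A.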
