Understanding Variant Effect Predictor results
3.9 years ago
caro-ca ▴ 20

I am studying the effect of transposable elements in Saccharomyces cerevisiae populations. The output I get from Variant Effect Predictor is as follows:

    Category                         Count
    Variants processed               414
    Overlapped genes                 1553
    Overlapped transcripts           1553
    Overlapped regulatory features   -

What is the difference between overlapped genes and overlapped transcripts? If a transposon overlaps a gene, I might get no transcript at all or a different transcript, depending on where the transposon is located. Also, I know from the literature that transposons can interrupt regulatory elements. Does your database have regulatory annotations for yeast?

For all consequences predicted:

    Upstream gene variant     49
    Downstream gene variant   42
    Intergenic variant        4
    Transcript ablation       3
    Coding sequence variant   1
    Feature elongation        1
    3' UTR variant            1

According to the summary above, 1553 genes overlap a transposon. How can so many genes be affected, given the consequences listed here?

And finally, the information for the consequences on a protein sequence:

    Stop codon lost           16
    Coding sequence variant   84

In the list of all consequences, coding sequence variants represent 1%. How is it possible that here the same consequence is 84%?

I hope you could help me out. Thank you in advance for your time.

ensembl vep variant effect predictor • 1.4k views

I am still trying to understand the results. S. cerevisiae has ~6000 genes and, according to the summary statistics, 1553 genes overlap a transposon sequence. That is approximately 25% of genes being affected by transposons. How can so many genes be affected if most of the impact is in the upstream/downstream region of a gene? And how can so many genes be affected by just 414 variants?

I really hope you could help me out. Thank you in advance.


The effect on the gene is that there is an up/downstream gene variant, i.e. there is a gene within 5 kb of the variant. This means that (49 + 42)% of the 1553 genes listed as being affected, i.e. 1413 genes, have a variant within 5 kb upstream or downstream of them. That is all it means.
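The arithmetic in the reply above, and the reason a few hundred variants can be reported against many more genes, can be sketched as follows. This is a toy illustration and not VEP's own code: the gene names and coordinates are made up, the helper `genes_near` is hypothetical, and the 5 kb window is simply the distance quoted above.

```python
# (49 + 42)% of the 1553 listed genes, as computed in the reply above.
upstream_pct = 49
downstream_pct = 42
genes_listed = 1553
flanking_genes = int(genes_listed * (upstream_pct + downstream_pct) / 100)
print(flanking_genes)  # 1413

WINDOW = 5_000  # flanking distance in bp, taken from the reply above


def genes_near(variant_pos, genes, window=WINDOW):
    """Return names of genes whose span, padded by `window`, contains the position."""
    return [
        name
        for name, (start, end) in genes.items()
        if start - window <= variant_pos <= end + window
    ]


# Hypothetical gene coordinates on one chromosome (start, end).
toy_genes = {
    "geneA": (1_000, 2_500),
    "geneB": (4_000, 6_000),
    "geneC": (9_000, 10_500),
    "geneD": (30_000, 31_000),
}

# One variant in a gene-dense region is reported against three genes.
print(genes_near(7_000, toy_genes))  # ['geneA', 'geneB', 'geneC']
```

In a compact genome, a single variant can fall within the flanking window of several genes at once, so the gene count can be well above the variant count.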


Additionally, my VCF input file contained 779 annotated variants, but your summary table reports only 414 variants processed. How does this work? I thought these two values were supposed to be the same.


It's possible that some of them failed. To find out more you'd need to send your list to helpdesk@ensembl.org.
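Before contacting the helpdesk, it may help to confirm how many variant records the input file really contains. The following is a minimal sketch, assuming an uncompressed, plain-text VCF in which header lines start with `#`; the function name `count_vcf_records` and the example records are hypothetical.

```python
import io


def count_vcf_records(lines):
    """Count data lines (non-blank, not starting with '#') in a VCF."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))


# A made-up two-record VCF held in memory; a real file would be opened with open(path).
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "I\t1000\t.\tA\tT\t.\t.\t.\n"
    "I\t2000\t.\tG\tC\t.\t.\t.\n"
)
print(count_vcf_records(example))  # 2
```

If this count matches the expected 779 but the summary still shows 414, the difference arises during processing rather than in the file itself.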


Thank you, I will send an email. On a separate note, I was looking at the position of a gene affected by a deletion in a transposable element (TE), using both your genome browser and IGV. How can I see my deletion in your genome browser? I can see the coding sequence variant, and I noticed that TEs are labelled in blue, but I assume these come from the reference genome in the Saccharomyces Genome Database (SGD).

3.8 years ago
Emily 24k
  1. A gene may have multiple transcripts and a variant may not overlap all transcripts of a gene. That is why there are two counts and they are often different.

  2. The first pie chart shows what percentage of all consequences each consequence type accounts for. The second pie chart takes the 1% that are coding consequences and shows how that 1% is divided up.
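Both points can be illustrated with a small sketch. The annotation rows below are made up, as are the identifiers; the percentages are the ones quoted in the question. This is only an illustration of the counting logic, not how VEP computes its summary.

```python
# Point 1: distinct genes vs. distinct transcripts.
# Toy annotation rows of (variant, gene, transcript); all IDs are hypothetical.
rows = [
    ("var1", "geneA", "geneA_t1"),
    ("var1", "geneA", "geneA_t2"),  # same gene, second transcript
    ("var2", "geneB", "geneB_t1"),
]
genes = {gene for _, gene, _ in rows}
transcripts = {transcript for _, _, transcript in rows}
print(len(genes), len(transcripts))  # 2 3

# Point 2: the second chart subdivides the coding slice of the first chart.
coding_share_of_all = 1  # % of all consequences (first chart)
within_coding = {"coding_sequence_variant": 84, "stop_lost": 16}  # % (second chart)
overall = {
    name: coding_share_of_all * pct / 100 for name, pct in within_coding.items()
}
print(overall)  # share of all consequences, in %
```

When every gene has exactly one annotated transcript, the two counts coincide, which is consistent with both being 1553 in the summary above.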


Thank you for your response. Do you have annotations for regulatory elements in yeast?


No, only human and mouse.
