Why does 1 variant produce 3 genes as a result?
4 months ago
mgranada3 ▴ 50

I obtained my VCF file by running SnpEff on my GATK output. Every entry in the ID column is a period, and I am confused about whether a gene name belongs there. When I use the Ensembl Variant Effect Predictor, multiple genes are reported per variant.

For example, there is only one hit on Chromosome A, at position 251395. When I run my data through the Variant Effect Predictor, 3 genes appear for this position. I am confused about what this means for my data, as none of these genes appear in my SnpEff gene text file for the same sample.

VCF File: (image not available)

Ensembl Variant Effect Predictor: (image not available)

SnpEff gene text file: (image not available)

GATK SnpEff • 401 views

The definitions of the annotations are located here. An upstream or downstream variant lies 5' or 3' of the listed gene, respectively, rather than inside it. Such variants can affect expression, but they do not change the amino acid sequence of either transcript. The stop_gained is more obvious and has more significant implications.

Regarding the data missing between tools: can you check the VCF file for the affected samples and see whether the stop_gained variant is present? As for the upstream/downstream variants, my guess is that the default distance cutoffs differ between tools.
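One way to do that check is to pull every annotation recorded at the position directly from the VCF. Below is a minimal sketch, assuming the file carries SnpEff's `ANN` field in the INFO column (pipe-separated, with effect, impact, and gene name in the second to fourth positions). The example record, the gene names `geneX`/`geneY`, and the transcript IDs are made up for illustration; only the chromosome and position come from the question.

```python
def parse_ann(info):
    """Return (effect, impact, gene) tuples from a SnpEff-style ANN entry in INFO."""
    for field in info.split(";"):
        if field.startswith("ANN="):
            entries = field[len("ANN="):].split(",")
            # Sub-fields 1..3 are effect, impact and gene name (0 is the allele).
            return [tuple(entry.split("|")[1:4]) for entry in entries]
    return []


def annotations_at(vcf_lines, chrom, pos):
    """Collect all annotations for records at the given chromosome and position."""
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        cols = line.rstrip("\n").split("\t")
        if cols[0] == chrom and int(cols[1]) == pos:
            hits.extend(parse_ann(cols[7]))  # column 8 is INFO
    return hits


# Hypothetical record: one variant, two annotations on two different genes.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "A\t251395\t.\tC\tT\t50\tPASS\t"
    "ANN=T|stop_gained|HIGH|geneX|geneX|transcript|tx1|protein_coding,"
    "T|upstream_gene_variant|MODIFIER|geneY|geneY|transcript|tx2|protein_coding",
]

result = annotations_at(example_vcf, "A", 251395)
for effect, impact, gene in result:
    print(gene, effect, impact)
```

For a real file, the list of lines would come from `open(path)` instead of `example_vcf`. If a gene reported by one tool never shows up here, the difference is coming from the annotation step rather than from the variant call itself.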

3 months ago

This can be explained by multiple genes being transcribed across the same genomic locus: each has its own TSS and/or promoter, so a single variant position can fall within or near more than one gene. It can also be explained by each gene having multiple transcript isoforms.
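To make this concrete, here is a rough sketch of how one position can be assigned to several genes once a flanking window is applied. The gene names, coordinates, strands, and the `flank` values are hypothetical and chosen only for illustration; this is not how VEP or SnpEff are implemented internally.

```python
def consequences(pos, genes, flank):
    """Label a position relative to each gene: within, upstream or downstream.

    genes is a list of (name, start, end, strand) tuples; flank is the maximum
    distance (bp) from a gene at which a variant is still reported for it.
    """
    calls = []
    for name, start, end, strand in genes:
        if start <= pos <= end:
            calls.append((name, "within_gene"))
            continue
        if pos < start:
            distance, before = start - pos, True
        else:
            distance, before = pos - end, False
        if distance <= flank:
            # On the minus strand the 5' side is at higher coordinates.
            five_prime = before if strand == "+" else not before
            calls.append((name, "upstream" if five_prime else "downstream"))
    return calls


# Hypothetical gene models around the variant position from the question.
genes = [
    ("geneA", 250000, 252000, "+"),
    ("geneB", 253000, 256000, "+"),
    ("geneC", 245000, 248000, "+"),
]

wide = consequences(251395, genes, flank=5000)
narrow = consequences(251395, genes, flank=1000)
print(wide)
print(narrow)
```

With the wide window all three genes are reported for the single position; with the narrow one only the gene that actually contains the variant remains. This is why two tools with different cutoffs can disagree on the gene list for the same variant.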

Kevin
