Why does snpEff annotation give different names to the same gene?
6.0 years ago
MatthewP ★ 1.4k

Hello, I annotated my VCF with a snpEff command like:

java -Xmx6g -jar snpEff.jar -c configfile GRCH38.86 > result.vcf

I checked the ANN information in the INFO field of result.vcf and found that the gene ND5 has two names (ND5 or MT-ND5), like:

C|upstream_gene_variant|MODIFIER|ND5|ND5|transcript|TRANSCRIPT_ND5|protein_coding||c.-4476T>C|||||4476|WARNING_TRANSCRIPT_MULTIPLE_STOP_COD

or

A|upstream_gene_variant|MODIFIER|MT-ND5|ENSG00000198786|transcript|ENST00000361567.2|protein_coding||c.-4484G>A||

Why is that? Do I need to keep only one gene name?

snpEff mtDNA • 1.8k views

MT probably stands for mitochondrial. The annotation is probably reporting the effects on both the normal gene and the mitochondrial gene.


They are the same gene - MT-ND5 is the official symbol, ND5 is an old alias.

Many sources are slow to update gene symbols, which change relatively frequently. Go by the gene or transcript ID whenever possible and use them to grab Gene Symbols for creating final tables, etc.
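As a rough sketch of that advice, the Python below splits the ANN value of an INFO string on `|`, keys each record by gene ID and transcript ID, and fills in one preferred symbol from a lookup table. The two example strings are adapted from the question (shortened, and with underscores as snpEff writes them). The `PREFERRED_SYMBOL` table and the function names are hypothetical, made up for illustration; in practice the table would come from whatever gene ID to symbol resource you trust.

```python
# Names of the first seven '|'-separated ANN sub-fields.
ANN_FIELDS = ("allele", "annotation", "impact", "gene_name",
              "gene_id", "feature_type", "feature_id")

# Hypothetical hand-made lookup: gene ID as written in ANN -> preferred symbol.
# Both IDs below refer to the same mitochondrial gene in this thread.
PREFERRED_SYMBOL = {
    "ENSG00000198786": "MT-ND5",
    "ND5": "MT-ND5",
}


def parse_ann(info):
    """Return one dict per ANN entry in a VCF INFO string (empty list if none)."""
    for item in info.split(";"):
        if item.startswith("ANN="):
            entries = item[len("ANN="):].split(",")
            # zip() stops at the shorter sequence, so extra sub-fields are ignored.
            return [dict(zip(ANN_FIELDS, entry.split("|"))) for entry in entries]
    return []


def preferred_symbol(record):
    """Look up the symbol by gene ID; fall back to the name snpEff reported."""
    return PREFERRED_SYMBOL.get(record["gene_id"], record["gene_name"])


# Example INFO strings adapted from the two annotations in the question.
info_strings = [
    "ANN=C|upstream_gene_variant|MODIFIER|ND5|ND5|transcript|TRANSCRIPT_ND5"
    "|protein_coding||c.-4476T>C|||||4476|",
    "ANN=A|upstream_gene_variant|MODIFIER|MT-ND5|ENSG00000198786|transcript"
    "|ENST00000361567.2|protein_coding||c.-4484G>A||",
]

# One row per annotation: gene ID, transcript ID, unified symbol.
rows = [
    (rec["gene_id"], rec["feature_id"], preferred_symbol(rec))
    for info in info_strings
    for rec in parse_ann(info)
]

for row in rows:
    print("\t".join(row))
```

Both rows end up with the symbol MT-ND5 while the original gene and transcript IDs stay available, so nothing is lost if you later need to tell the two annotation records apart.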

6.0 years ago

It's likely due to alternative transcripts of the gene, since, as you said, both of these are the same gene. I don't know all the annotation sources that snpEff uses, but the differences in the gene name/ID could come from that. Whether to keep both depends on whether you're interested in the impact on potential isoforms or non-canonical transcripts. Is there any particular reason you don't want to just keep both of them?


Thanks. The annotation source is the built-in GRCH38.86 database from snpEff. I don't have to keep only one of them; I was just confused.
