GTF File contains multiple gene ids for the same gene name?
7.4 years ago
alpanagi ▴ 10

I am going through the GENCODE mm10 annotation (Comprehensive gene annotation - PRI) and I'm seeing some genes that have the same name but different gene IDs, specifically the following:

[1] Sept2      Ccl27a     Ccl21b     Fam205a2   Il11ra2    Ccl19      Ccl21a     Jakmip1
[9] Ugt2a1     Gm3286     Btbd8      U2af1l4    Dlg2       Itgam      Map2k7     Raver1
[17] Olfr912    Rnf26      Lilrb4a    Sumo3      Gm2696     Adat3      Dohh       Gm3055
[25] Gm12057    Spata22    St6galnac2 Srp54a     Gm16381    Zfp935     Olfr190    Crybg3
[33] Pcdha11    Nudt8

Some of them are from different annotation sources (HAVANA, ENSEMBL) but others are from the same source. Many of them have loci close to each other but others (like Ccl27a) have loci that are not related.

What gives? Also how would I handle read counts associated with them? Should I just sum genes with the same name even though they have different gene ids?

Thanks!

alignment RNA-Seq gtf • 2.7k views
7.4 years ago

I recommend using Ensembl or Havana IDs instead of gene names, because gene names are not standardized in the way that Ensembl/Havana/GENCODE IDs are. I haven't verified this myself, but I read it on the web, and at my job I have been told to use ENS IDs.

You can then map the IDs back to the corresponding gene names afterwards, for example for searching.
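
To illustrate the ID-to-name mapping, here is a minimal Python sketch that reads the attribute column of "gene" lines in a GTF, builds a gene_id to gene_name lookup, and lists names that are shared by more than one ID. The function names and the toy GTF lines (GENE0001.1, Abc1, and so on) are made up for illustration and are not taken from the GENCODE file; it assumes the usual tab-separated layout with quoted `key "value";` attributes in column 9.

```python
import re
from collections import defaultdict

# Matches quoted key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_id_to_name(gtf_lines):
    """Return {gene_id: gene_name} built from the 'gene' feature lines."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # only use gene-level records
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "gene_id" in attrs:
            mapping[attrs["gene_id"]] = attrs.get("gene_name", "")
    return mapping


def names_with_multiple_ids(mapping):
    """Return {gene_name: [gene_ids]} for names that map to more than one ID."""
    by_name = defaultdict(list)
    for gene_id, name in mapping.items():
        by_name[name].append(gene_id)
    return {name: sorted(ids) for name, ids in by_name.items() if len(ids) > 1}


# Hypothetical toy input, not real annotation records.
example = [
    "##description: toy example\n",
    'chr1\tHAVANA\tgene\t100\t200\t.\t+\t.\tgene_id "GENE0001.1"; gene_type "protein_coding"; gene_name "Abc1";\n',
    'chr2\tENSEMBL\tgene\t300\t400\t.\t-\t.\tgene_id "GENE0002.1"; gene_type "protein_coding"; gene_name "Abc1";\n',
    'chr3\tHAVANA\tgene\t500\t600\t.\t+\t.\tgene_id "GENE0003.1"; gene_type "lncRNA"; gene_name "Xyz2";\n',
    'chr3\tHAVANA\texon\t500\t600\t.\t+\t.\tgene_id "GENE0003.1"; gene_name "Xyz2";\n',
]

id_to_name = gene_id_to_name(example)
duplicated = names_with_multiple_ids(id_to_name)
print(duplicated)
```

For a real file you would pass an open file handle instead of the `example` list, since the function only needs an iterable of lines.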


You were lucky, I found the source:

"We recommend to use unique gene identifiers, such as NCBI Entrez gene identifiers, to cluster features into meta-features. Gene names are not recommended to use for this purpose because different genes may have the same names. Unique gene identifiers were often included in many publicly available GTF annotations which can be readily used for summarization."

From the manual of Rsubread, a Bioconductor package.
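
As a rough illustration of why the key matters, here is a small Python sketch that sums feature-level counts into gene-level totals, once keyed by gene ID and once keyed by gene name. This is plain Python written for this thread, not Rsubread's own code or API, and the IDs, names, and counts are invented toy values.

```python
from collections import defaultdict


def summarize(features, key):
    """Sum feature-level counts into totals grouped by the given key."""
    totals = defaultdict(int)
    for feature in features:
        totals[feature[key]] += feature["count"]
    return dict(totals)


# Hypothetical toy data: two distinct gene IDs share the name "Abc1".
features = [
    {"gene_id": "GENE0001.1", "gene_name": "Abc1", "count": 100},
    {"gene_id": "GENE0001.1", "gene_name": "Abc1", "count": 20},
    {"gene_id": "GENE0002.1", "gene_name": "Abc1", "count": 7},
    {"gene_id": "GENE0003.1", "gene_name": "Xyz2", "count": 42},
]

by_id = summarize(features, "gene_id")      # keeps the two Abc1 loci separate
by_name = summarize(features, "gene_name")  # merges them into one total

print(by_id)
print(by_name)
```

Grouping by ID keeps the two loci as separate rows, while grouping by name silently merges them, which is the situation the quoted recommendation warns about.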
