Multiple Ensembl IDs
6.6 years ago

Hi everyone

I have sample-wise data from TCGA for TNBC, which contains approx. 60000 Ensembl IDs for each patient's sample, like ENSG00000242268.2, ENSG00000270112.3, ENSG00000167578.15, ENSG00000273842.1, ENSG00000078237.5, ENSG00000146083.10 and so on. What do these IDs belong to? If they were different transcripts, they would have to start with "ENST". Can anyone suggest?

Ensemble_ID • 2.1k views

Thanks Devon and fin swimmer. If these are gene IDs, then I got 60483 such IDs in one patient sample, and all of them are unique. Also, what about the digits after the decimal point? I think it is the Ensembl version, but then how do I convert them to gene IDs?


But the 60483 IDs aren't unique, are they? Or do some have the same ID but a different version?

What do you mean by "convert them to gene id"?

EDIT: I took a look at the statistics page of the current Ensembl release. If I sum up all known genes, including pseudogenes and non-coding genes, in the primary assembly and the alternative sequences, I come to 60327 genes. That's close to your number. Could this be it?
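To check whether any stable ID occurs more than once with a different version, something like the following might help. It is only a minimal sketch: `strip_version` and `duplicated_stable_ids` are hypothetical helper names, and it assumes the version is everything after the first dot in each ID.

```python
from collections import Counter


def strip_version(ensembl_id):
    """Return the stable ID without the trailing '.<version>' suffix."""
    return ensembl_id.split(".", 1)[0]


def duplicated_stable_ids(ids):
    """Return the stable IDs that appear more than once after dropping versions."""
    counts = Counter(strip_version(i) for i in ids)
    return sorted(stable for stable, n in counts.items() if n > 1)


# The IDs quoted in the question.
ids = [
    "ENSG00000242268.2",
    "ENSG00000270112.3",
    "ENSG00000167578.15",
    "ENSG00000273842.1",
    "ENSG00000078237.5",
    "ENSG00000146083.10",
]

print(len(ids), "IDs,", len({strip_version(i) for i in ids}), "distinct stable IDs")
print("duplicated:", duplicated_stable_ids(ids))
```

If the duplicated list is empty for the full column of IDs, every row is a distinct gene and the versions are not hiding repeats.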


shivangi.agarwal800 : Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

6.6 years ago

Those are gene IDs. You can use Ensembl BioMart if you'd like to map them to normal gene identifiers.
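Once a table has been exported from BioMart, the lookup itself could be sketched roughly as below. This is an assumption-laden example, not an official workflow: the column headers "Gene stable ID" and "Gene name" are assumed to match the exported file, the gene names are placeholders rather than real symbols, and `load_symbol_table` and `to_symbol` are hypothetical helpers. The version suffix is dropped before the lookup because the exported IDs here are assumed to be unversioned.

```python
import csv
import io


def load_symbol_table(handle):
    """Read a tab-separated export into a {stable ID: gene name} dict."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Gene stable ID"]: row["Gene name"] for row in reader}


def to_symbol(versioned_id, table):
    """Drop the '.<version>' suffix and look the stable ID up; None if absent."""
    stable_id = versioned_id.split(".", 1)[0]
    return table.get(stable_id)


# In-memory stand-in for an exported file; the names are placeholders,
# not real gene symbols.
export = io.StringIO(
    "Gene stable ID\tGene name\n"
    "ENSG00000167578\tPLACEHOLDER_1\n"
    "ENSG00000146083\tPLACEHOLDER_2\n"
)
table = load_symbol_table(export)

print(to_symbol("ENSG00000167578.15", table))
print(to_symbol("ENSG00000242268.2", table))
```

With a real export, `io.StringIO(...)` would be replaced by `open(path, newline="")` on the downloaded file.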

6.6 years ago

Hello,

IDs starting with "ENSG" are Ensembl's stable IDs for genes, not for transcripts.

fin swimmer

EDIT: Devon was faster :)
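For anyone who wants to check this on a whole column of IDs, here is a small sketch that reads the feature type and version off a human Ensembl ID. It assumes the human layout (prefix "ENS", one type letter, 11 digits, optional ".version"); IDs from other species carry an extra species code and would not match. `describe` is a hypothetical helper, and the ENST ID in the example is made up for illustration.

```python
import re

# Type letter that follows the "ENS" prefix in human stable IDs.
FEATURE_TYPES = {"G": "gene", "T": "transcript", "P": "protein", "E": "exon"}

# ENS + type letter + 11 digits + optional ".<version>"
_PATTERN = re.compile(r"^ENS([GTPE])\d{11}(?:\.(\d+))?$")


def describe(ensembl_id):
    """Return (feature type, version or None), or None if the ID does not match."""
    match = _PATTERN.match(ensembl_id)
    if match is None:
        return None
    letter, version = match.groups()
    return FEATURE_TYPES[letter], int(version) if version else None


print(describe("ENSG00000242268.2"))   # an ID from the question
print(describe("ENST00000000001.1"))   # made-up transcript ID, for illustration only
print(describe("not-an-ensembl-id"))
```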


I don't proofread, so that saved me a couple of minutes :)
