Question

Why are there 55,092 unique Ensembl ENSG IDs?


2.6 years ago

4galaxy77 2.9k

I've annotated an imputed dataset (~40 million variants) with CADD scores from the CADD database, along with the associated ENSG (e.g. ENSG00000283761) and ENST IDs.

There are 55,092 unique ENSG IDs in my dataset. I thought there was one ENSG ID per gene and that humans have ~20,000 genes (give or take a few thousand), so this is far more than I expected.

Why are there so many IDs, and does each one correspond to a unique gene?

gene ensembl • 479 views

updated 2.6 years ago by GenoMax 147k • written 2.6 years ago by 4galaxy77 2.9k

score 5 · Accepted Answer · 2022-04-11


2.6 years ago

GenoMax 147k

There are ~22K protein-coding genes. The full list of ENSG IDs also contains all sorts of other things, such as pseudogenes. Here is a summary from BioMart.

[image: list]

Here are the other types:

[image: Other types]
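If you want to tally gene types yourself instead of reading them off BioMart, something along these lines may help. It is only a minimal sketch: it assumes an Ensembl-style GTF in which the ninth column carries `gene_id` and `gene_biotype` attributes, the function name `count_genes_by_biotype` is made up for this example, and the records and IDs in `SAMPLE_GTF` are toy placeholders, not real Ensembl genes.

```python
import re
from collections import defaultdict

# Toy GTF-style records (made-up IDs, for illustration only).
# Tab-separated columns; the ninth column holds key "value"; pairs.
SAMPLE_GTF = "\n".join([
    '1\tensembl\tgene\t100\t900\t.\t+\t.\tgene_id "ENSG_TOY_0001"; gene_biotype "protein_coding";',
    '1\tensembl\ttranscript\t100\t900\t.\t+\t.\tgene_id "ENSG_TOY_0001"; transcript_id "ENST_TOY_0001"; gene_biotype "protein_coding";',
    '1\tensembl\tgene\t2000\t2600\t.\t-\t.\tgene_id "ENSG_TOY_0002"; gene_biotype "protein_coding";',
    '2\tensembl\tgene\t500\t1500\t.\t+\t.\tgene_id "ENSG_TOY_0003"; gene_biotype "lncRNA";',
    '3\tensembl\tgene\t40\t400\t.\t-\t.\tgene_id "ENSG_TOY_0004"; gene_biotype "processed_pseudogene";',
])

# Matches attribute pairs such as: gene_id "ENSG_TOY_0001"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def count_genes_by_biotype(gtf_text):
    """Return {biotype: number of unique gene IDs} for GTF-formatted text."""
    genes_by_biotype = defaultdict(set)
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id")
        if gene_id is None:
            continue
        biotype = attrs.get("gene_biotype", "unknown")
        # A set ensures each gene is counted once, however many
        # transcript or exon rows mention it.
        genes_by_biotype[biotype].add(gene_id)
    return {biotype: len(ids) for biotype, ids in genes_by_biotype.items()}


counts = count_genes_by_biotype(SAMPLE_GTF)
for biotype, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{biotype}\t{n}")
```

On a real annotation file you would pass the file contents in place of `SAMPLE_GTF`. The protein-coding row should then land near the ~22K mentioned above, with the remaining IDs spread across the other biotypes.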
