2.6 years ago
4galaxy77
I've annotated an imputed dataset (~40 million variants) with CADD scores from the CADD database, along with the associated Ensembl gene IDs (ENSG, e.g. ENSG00000283761) and transcript IDs (ENST).
There are 55,092 unique ENSG IDs in my dataset. I thought there was one ID per gene, and since humans have ~20,000 genes (give or take a few thousand), this is far more than I expected.
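A count like this can be reproduced with a sketch along the following lines. It assumes a tab-separated annotation table with one row per variant annotation and a gene ID column named `GeneID`; the function name, the column name, and the example rows are illustrative assumptions, not taken from the actual file.

```python
import csv
import io


def count_unique_gene_ids(handle, column="GeneID"):
    """Count distinct Ensembl gene IDs in a tab-separated annotation table.

    `column` is an assumed header name; change it to match the real file.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    unique_ids = set()
    for row in reader:
        gene_id = (row.get(column) or "").strip()
        # Skip missing values and anything that is not an Ensembl gene ID.
        if gene_id.startswith("ENSG"):
            # Drop a version suffix such as ".1" so versions are not double-counted.
            unique_ids.add(gene_id.split(".")[0])
    return len(unique_ids)


# Tiny made-up table; real input would be the annotated variant file.
example = io.StringIO(
    "Chrom\tPos\tGeneID\n"
    "1\t100\tENSG00000283761\n"
    "1\t200\tENSG00000283761\n"
    "1\t300\tENSG00000000003\n"
    "1\t400\tNA\n"
)
print(count_unique_gene_ids(example))
```

For a real file, pass an open file handle instead of the `io.StringIO` example.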
Why are there so many IDs, and does each one correspond to a unique gene?