GSEA RNASeq

2.6 years ago

Rob ▴ 170

Hi friends. For gene set enrichment analysis (GSEA), the software from the Broad Institute does not accept Ensembl IDs. I want to do the analysis using Entrez IDs or HUGO IDs, but about 2000 genes don't have a HUGO ID or an Entrez ID.

What should I do?

gene enrichment analysis set rna-seq • 924 views

Updated 2.6 years ago by Carlo Yague 8.9k • written 2.6 years ago by Rob ▴ 170

Losing genes after ID conversion, although annoying, is normal to some extent. But 2000 is a lot. How did you perform the conversion? Which genome is it (species and version)?

2.6 years ago by Carlo Yague 8.9k

Hi, thanks for responding. It is human, all 20000 coding genes. About 2000 are missing after ID conversion. I did the ID conversion using the biomaRt package in R.

2.6 years ago by Rob ▴ 170

I see. A few thoughts:

Biomart is a great tool for ID conversion so it is probably not the problem.
I'm not sure if it will change anything, but perhaps you can try removing the version number from the Ensembl IDs (ENSG00000010404.10 becomes ENSG00000010404) for the 2000 unconverted IDs and see if it improves the conversion rate.
Finally, as I said, it is normal to lose genes after conversion, so don't panic: there is just not always a one-to-one relationship between Ensembl and Entrez IDs.
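
For what it's worth, here is a minimal sketch of the version-stripping idea, written in Python just for illustration. The function name `strip_ensembl_version` is made up for this example, and it assumes the version is always a trailing dot followed by digits, as in the ID above. The same regular expression should carry over to R's `sub()` if you prefer to stay in R.

```python
import re

# Trailing ".<digits>" version suffix, e.g. ".10" in ENSG00000010404.10
# (assumption: versions always look like this; other suffixes are left alone)
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_ensembl_version(gene_id: str) -> str:
    """Return the Ensembl ID without its version suffix, if it has one."""
    return _VERSION_SUFFIX.sub("", gene_id.strip())


# The ID from this thread, plus one that is already unversioned
ids = ["ENSG00000010404.10", "ENSG00000010404"]
stripped = [strip_ensembl_version(i) for i in ids]
print(stripped)
```

Both entries should come out as the same unversioned ID, so it is worth removing duplicates before sending the list back through the conversion step.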

2.6 years ago by Carlo Yague 8.9k
