I am conducting RNA-seq data analysis using a unique rodent species, Apodemus agrarius.
For Gene Set Analysis (GSA), I downloaded the mouse Gene Ontology (GO) gene sets from MSigDB. However, when I performed the analysis, the overlap between the gene symbols in my data and the genes in the mouse GO gene sets was quite low.
After switching to the human GO database, the overlap improved significantly, as shown in the following plot. It seems more appropriate for my analysis.
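For anyone who wants to reproduce the comparison, here is a minimal sketch of how the overlap can be measured. It assumes the gene sets are in GMT format (set name, description, then tab-separated gene symbols) and that the dataset genes are a plain list of symbols. The function names and the toy gene set are hypothetical, not taken from my actual pipeline.

```python
def parse_gmt(lines):
    """Parse GMT-format lines into {set_name: set_of_gene_symbols}."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        gene_sets[fields[0]] = set(fields[2:])
    return gene_sets


def overlap_fraction(dataset_genes, gene_sets, ignore_case=False):
    """Fraction of dataset genes found in at least one gene set."""
    norm = str.upper if ignore_case else (lambda s: s)
    data = {norm(g) for g in dataset_genes}
    universe = {norm(g) for genes in gene_sets.values() for g in genes}
    if not data:
        return 0.0
    return len(data & universe) / len(data)


# Toy example: human-style symbols in the data, mouse-style symbols in the set.
dataset_genes = ["TP53", "BRCA1", "GAPDH", "ACTB"]
mouse_style_gmt = ["GO_TOY_SET\tna\tTrp53\tBrca1\tGapdh\tActb"]
toy_sets = parse_gmt(mouse_style_gmt)

exact = overlap_fraction(dataset_genes, toy_sets)
case_insensitive = overlap_fraction(dataset_genes, toy_sets, ignore_case=True)
print(exact, case_insensitive)
```

In this toy case, ignoring case recovers most symbols but not TP53, because the mouse symbol is Trp53. So a case-insensitive match alone does not make human and mouse symbols interchangeable.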
Current Question:
Since the gene symbols in the original data come from Ensembl IDs, and the GTF file I used for annotation is for Apodemus agrarius, I suspect that the gene symbols may have been annotated based on human data.
As of now, there is only one GTF file available for Apodemus agrarius. Given this, I am uncertain about how to proceed with GSA and GSEA:
Does it make sense to use the human GO database for functional analysis?
I am unsure whether the gene symbols for humans and rodents are the same for the same genes. If not, should I use Ensembl IDs or Entrez IDs for comparison instead?
In that case, which R package or method would be suitable for converting gene IDs using up-to-date annotation data? I would appreciate any guidance on how to proceed with the analysis.
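To make the ID-conversion question concrete, this is roughly the kind of mapping I have in mind, sketched in Python. It assumes an ortholog table of (source gene ID, human gene symbol) pairs is available from some annotation resource, and it keeps only one-to-one relationships. The gene IDs below are hypothetical placeholders, and the function name is my own.

```python
from collections import defaultdict


def one_to_one_orthologs(pairs):
    """Keep only source IDs that map to exactly one target symbol,
    where that symbol is in turn mapped by exactly one source ID."""
    src_to_tgt = defaultdict(set)
    tgt_to_src = defaultdict(set)
    for src, tgt in pairs:
        if not src or not tgt:
            continue  # skip rows with a missing ID or symbol
        src_to_tgt[src].add(tgt)
        tgt_to_src[tgt].add(src)

    mapping = {}
    for src, tgts in src_to_tgt.items():
        if len(tgts) != 1:
            continue  # one-to-many: ambiguous, dropped
        tgt = next(iter(tgts))
        if len(tgt_to_src[tgt]) == 1:
            mapping[src] = tgt  # many-to-one cases are dropped as well
    return mapping


# Hypothetical ortholog table: (source gene ID, human symbol).
toy_pairs = [
    ("GENE_A", "TP53"),
    ("GENE_B", "BRCA1"),
    ("GENE_C", "HBA1"),  # GENE_C maps to two human genes
    ("GENE_C", "HBA2"),
    ("GENE_D", "ACTB"),  # two source genes map to ACTB
    ("GENE_E", "ACTB"),
    ("GENE_F", ""),      # no ortholog reported
]
mapping = one_to_one_orthologs(toy_pairs)
print(mapping)
```

Restricting to one-to-one orthologs is a conservative choice that discards some genes. Whether that is appropriate here, or whether one-to-many relationships should be kept, is part of what I am unsure about.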
I would really appreciate any comments on this matter.