Hello,
There is an annotation table for corn / maize (Zea mays) at Ensembl, accessible via biomaRt.
What I would do is first pull a complete annotation table from Ensembl, which can actually be quicker than doing specific lookups:
```r
library(biomaRt)

# connect to the Ensembl Plants mart and select the Zea mays gene dataset
mart <- useMart('plants_mart', 'zmays_eg_gene',
  host = 'https://plants.ensembl.org')

# pull the complete annotation table in a single query
annot <- getBM(
  attributes = c('ensembl_gene_id', 'entrezgene_id', 'gene_biotype'),
  mart = mart)

head(annot)
  ensembl_gene_id entrezgene_id    gene_biotype
1 Zm00001eb442760            NA misc_non_coding
2 Zm00001eb393960            NA misc_non_coding
3 Zm00001eb113450            NA misc_non_coding
4 Zm00001eb437000            NA misc_non_coding
5 Zm00001eb441340            NA misc_non_coding
6 Zm00001eb437720            NA misc_non_coding

# show only the genes that have an Entrez ID
head(annot[!is.na(annot$entrezgene_id),])
     ensembl_gene_id entrezgene_id   gene_biotype
4542 Zm00001eb321680     100502366 protein_coding
4543 Zm00001eb323640     100501883 protein_coding
4545 Zm00001eb080260     100381510 protein_coding
4547 Zm00001eb281360     100275601 protein_coding
4551 Zm00001eb155150        542087 protein_coding
4552 Zm00001eb144800     100277214 protein_coding
```
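As the NA values above show, not every Ensembl gene ID has an Entrez ID, so it can be worth checking the mapping rate before passing the IDs to a downstream tool. Here is a minimal sketch of that check. It uses a small hand-built stand-in for `annot` (rows copied from the output above) so that it runs offline; `toy_annot`, `has_entrez`, `n_mapped`, `frac_mapped` and `mapped_by_biotype` are names chosen here for illustration only.

```r
# Toy stand-in for the real `annot` table (rows copied from the output above)
toy_annot <- data.frame(
  ensembl_gene_id = c('Zm00001eb442760', 'Zm00001eb393960',
                      'Zm00001eb321680', 'Zm00001eb155150'),
  entrezgene_id   = c(NA, NA, 100502366, 542087),
  gene_biotype    = c('misc_non_coding', 'misc_non_coding',
                      'protein_coding', 'protein_coding'),
  stringsAsFactors = FALSE)

# TRUE for every gene that has an Entrez ID
has_entrez <- !is.na(toy_annot$entrezgene_id)

# number and fraction of genes with an Entrez ID
n_mapped <- sum(has_entrez)
frac_mapped <- mean(has_entrez)

# mapping status broken down by biotype
mapped_by_biotype <- table(toy_annot$gene_biotype, has_entrez)
print(mapped_by_biotype)
```

On the real table, the same lines would be run with `annot` in place of `toy_annot`.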
Then, you can do a simple lookup locally like this:
```r
lookup <- data.frame(genes = c('Zm00001eb000370', 'Zm00001eb000450'))

# left join: keep every row of lookup and add the matching annotation
merge(
  x = as.data.frame(lookup),
  y = annot,
  by.x = 'genes',
  by.y = 'ensembl_gene_id',
  all.x = TRUE)
            genes entrezgene_id   gene_biotype
1 Zm00001eb000370     103630483 protein_coding
2 Zm00001eb000450     100285831 protein_coding
```

Using your own diff_genes variable, this could be run as:

```r
# by.x = 'genes' assumes the ID column in diff_genes is named 'genes'
merge(
  x = as.data.frame(diff_genes),
  y = annot,
  by.x = 'genes',
  by.y = 'ensembl_gene_id',
  all.x = TRUE)
```
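Two behaviours of `merge()` are worth checking on your own data. With `all.x = TRUE`, IDs that are absent from the annotation are kept with NA values. If an Ensembl ID maps to more than one Entrez ID, you get one output row per mapping. Here is a minimal sketch with made-up data: the IDs `geneA`, `geneB`, `geneC` and the Entrez values are invented for illustration and are not real maize records.

```r
# Invented annotation: geneB maps to two Entrez IDs, geneC is absent
toy_annot <- data.frame(
  ensembl_gene_id = c('geneA', 'geneB', 'geneB'),
  entrezgene_id   = c(101, 201, 202),
  gene_biotype    = 'protein_coding',
  stringsAsFactors = FALSE)

toy_lookup <- data.frame(
  genes = c('geneA', 'geneB', 'geneC'),
  stringsAsFactors = FALSE)

# same left join as above, on the toy tables
res <- merge(
  x = toy_lookup,
  y = toy_annot,
  by.x = 'genes',
  by.y = 'ensembl_gene_id',
  all.x = TRUE)
print(res)

# IDs that found no annotation at all
unmatched <- res$genes[is.na(res$gene_biotype)]

# IDs that appear on more than one row (one-to-many mappings)
dup_ids <- unique(res$genes[duplicated(res$genes)])
```

How to handle the duplicated rows (keep all, keep the first, or collapse) depends on what the downstream analysis expects.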
You can check for further attributes that you may want to retrieve from Ensembl via `listAttributes(mart)`.
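`listAttributes()` returns a data frame with `name` and `description` columns, so it can be searched with `grepl()`. Below is a minimal sketch on a hand-built stand-in table so that it runs without a connection. The attribute names and descriptions in the toy table are assumptions based on what Ensembl marts commonly expose, and should be confirmed against `listAttributes(mart)` for the maize dataset. `find_attributes` is a hypothetical helper, not a biomaRt function.

```r
# Hand-built stand-in (assumed contents);
# with a live connection use: attrs <- listAttributes(mart)
attrs <- data.frame(
  name = c('ensembl_gene_id', 'entrezgene_id', 'gene_biotype',
           'go_id', 'name_1006'),
  description = c('Gene stable ID', 'NCBI gene (formerly Entrezgene) ID',
                  'Gene type', 'GO term accession', 'GO term name'),
  stringsAsFactors = FALSE)

# Hypothetical helper: rows whose name or description matches a pattern
find_attributes <- function(attrs, pattern) {
  hit <- grepl(pattern, attrs$name, ignore.case = TRUE) |
    grepl(pattern, attrs$description, ignore.case = TRUE)
  attrs[hit, , drop = FALSE]
}

go_attrs <- find_attributes(attrs, 'GO term')
print(go_attrs)
```

Any attribute names found this way can then be added to the `attributes` vector in the `getBM()` call.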
Kevin
Kevin, thank you very much for your valuable help. I am using this output for a KEGG analysis, but I recover very few genes (6) of the more than 1000 that I enter. So I was wondering if you could guide me on how to do a GO enrichment analysis. I tried clusterProfiler, but there is no support for corn in organism = "org.XXX.eg.db". Could you guide me?
I would greatly appreciate your help.
This script has worked for GO enrichment in Arabidopsis, but I have not been able to adapt it for maize: