Great. In that case, you can use the Escherichia coli K-12 annotation package (org.EcK12.eg.db) from Bioconductor:
library(org.EcK12.eg.db)
genes <- c('glnD','eutC','mscM','tatA','elaB','fliI','sapD','ppnP',
           'ybhA','ilvE','tatE','oppC')
mapIds(org.EcK12.eg.db, keys = genes,
       column = 'ENTREZID', keytype = 'SYMBOL')
    glnD     eutC     mscM     tatA     elaB     fliI     sapD     ppnP
"944863" "946925" "948676" "948321" "946751" "946457" "946203" "945048"
    ybhA     ilvE     tatE     oppC
"945372" "948278" "945228" "945810"
With the Entrez IDs mapped, we can look up other annotation for these genes (the same IDs can also be used as input to clusterProfiler):
genes_entrez <- mapIds(org.EcK12.eg.db, keys = genes,
                       column = 'ENTREZID', keytype = 'SYMBOL')
keytypes(org.EcK12.eg.db)
[1] "ACCNUM" "ALIAS" "ENTREZID" "ENZYME" "EVIDENCE"
[6] "EVIDENCEALL" "GENENAME" "GO" "GOALL" "ONTOLOGY"
[11] "ONTOLOGYALL" "PATH" "PMID" "REFSEQ" "SYMBOL"
# keytype is stated explicitly; ENTREZID is the central key of this package
annotTable <- select(org.EcK12.eg.db, keys = genes_entrez, keytype = 'ENTREZID',
                     columns = c('ENTREZID', 'ALIAS', 'ENZYME', 'SYMBOL', 'GENENAME', 'PATH'))
head(annotTable)
  ENTREZID   ALIAS   ENZYME SYMBOL
1   944863 ECK0165 2.7.7.59   glnD
2   944863   glnD5 2.7.7.59   glnD
3   944863    glnD 2.7.7.59   glnD
4   946925 ECK2435  4.3.1.7   eutC
5   946925    eutC  4.3.1.7   eutC
6   948676 ECK4155     <NA>   mscM
                                          GENENAME  PATH
1 PII uridylyltransferase/uridylyl removing enzyme 02020
2 PII uridylyltransferase/uridylyl removing enzyme 02020
3 PII uridylyltransferase/uridylyl removing enzyme 02020
4          ethanolamine ammonia-lyase subunit beta 00564
5          ethanolamine ammonia-lyase subunit beta 00564
6    miniconductance mechanosensitive channel MscM  <NA>
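Note that select() returns one row per combination of values, so a gene with several aliases appears several times (glnD takes three rows above). If you would rather have one row per gene, something along these lines should work. This is only a minimal base R sketch: the toy data frame `annot` and the result name `collapsed` are my own, the values are copied from the head() output above, and the ';' separator is an arbitrary choice.

```r
# Toy copy of the first six rows of annotTable shown above
# (hypothetical object name; only three of the columns are kept)
annot <- data.frame(
  ENTREZID = c("944863", "944863", "944863", "946925", "946925", "948676"),
  ALIAS    = c("ECK0165", "glnD5", "glnD", "ECK2435", "eutC", "ECK4155"),
  SYMBOL   = c("glnD", "glnD", "glnD", "eutC", "eutC", "mscM"),
  stringsAsFactors = FALSE
)

# Collapse to one row per gene, joining the unique aliases with ';'
collapsed <- aggregate(ALIAS ~ ENTREZID + SYMBOL, data = annot,
                       FUN = function(x) paste(unique(x), collapse = ";"))
collapsed
```

Columns that contain NA (ENZYME and PATH here) are left out on purpose, because the formula interface of aggregate() drops rows with missing values by default.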
Kevin
Please provide some sample input data.
Each set (X1 to X4) contains the genes and their corresponding logFC values. For example, here is a small sample taken from my dataset:
X1
X2
X3
X4