Hi, I have a huge list of gene names, and I want to assign an ID to each gene. I tried the following code, and it mostly worked, but it produced more IDs than there are genes, which misaligns the names and the IDs.
CODE
library("org.Hs.eg.db")
a <- read.csv("filename.csv", header = TRUE, sep = ",")
gene <- as.character(a$Gene.refGene)  # column name is Gene.refGene
output <- unlist(mget(x = gene, envir = org.Hs.egALIAS2EG, ifnotfound = NA))
write.csv(output, file = "output.csv")  # write the IDs (the output) to a CSV file
Is there another way to get the gene IDs, or any suggestion on how to modify the code? Your help is highly appreciated!
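Some gene names map to more than one ID, and vice versa. There is no way around it. Because unlist() flattens every ID into a single vector, the output ends up longer than your gene list. One way to deal with it is to keep only one ID per gene name.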
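A minimal sketch of the "keep only one" idea, in case it helps. To keep it runnable without the annotation package, it uses a small hand-made list (`ids_per_gene`, with made-up names and IDs) as a stand-in for what `mget(...)` returns. Taking the first ID is an arbitrary choice, and the `n_ids` column is just one way to flag the ambiguous names for a manual check.

```r
# Hypothetical stand-in for the list returned by
#   mget(x = gene, envir = org.Hs.egALIAS2EG, ifnotfound = NA)
# One element per gene name, each holding one or more IDs (or NA).
# The names and IDs below are invented for illustration only.
ids_per_gene <- list(
  GENE_A = c("101", "202"),  # name that maps to two IDs
  GENE_B = "303",            # unambiguous name
  GENE_C = NA                # name that was not found
)

# Keep only the first ID for each name, so there is exactly
# one entry per input gene and nothing shifts out of alignment.
first_id <- vapply(ids_per_gene,
                   function(ids) as.character(ids[1]),
                   character(1))

# Count the non-missing IDs per name to flag ambiguous ones.
n_ids <- vapply(ids_per_gene,
                function(ids) sum(!is.na(ids)),
                integer(1))

output <- data.frame(
  gene      = names(ids_per_gene),
  entrez_id = unname(first_id),
  n_ids     = unname(n_ids),
  stringsAsFactors = FALSE
)
print(output)
```

With the real data you would replace `ids_per_gene` with the `mget(...)` result (without `unlist()`) and write `output` with `write.csv(output, file = "output.csv", row.names = FALSE)`. If AnnotationDbi is loaded, `mapIds(org.Hs.eg.db, keys = gene, column = "ENTREZID", keytype = "ALIAS", multiVals = "first")` should give the same one-ID-per-name behavior in a single call.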