16 months ago
dylannicoembros
Good evening,
I have a dds object with gene IDs, and I need to convert them into gene symbols. The problem is that some genes do not have a match, and I don't want to lose them in the analysis. I used this procedure with biomaRt:
library(biomaRt)
# List the available BioMart databases and the datasets in the "genes" mart
listEnsembl()
ensembl <- useEnsembl(biomart = "genes")
datasets <- listDatasets(ensembl)
# Connect to the human gene dataset
ensembl.con <- useMart("ensembl", dataset = 'hsapiens_gene_ensembl')
attr <- listAttributes(ensembl.con)
filters <- listFilters(ensembl.con)
# Query the gene symbol for each Ensembl gene ID
t <- getBM(attributes = c('ensembl_gene_id', 'external_gene_name'),
           filters = "ensembl_gene_id",
           values = ensembl.ids$V1,
           mart = ensembl.con)
rownames(dds) <- t$external_gene_name[match(rownames(dds), t$ensembl_gene_id)]
But this creates NA values if a gene does not have a match. I need to preserve the gene IDs in the rows when there is no correspondence, so that I end up with gene symbols where they exist and gene IDs in the rows where BioMart does not provide a symbol (or where no symbol exists at all for the gene).
How can I do that?
Thank you for your time.
Have you tried using a simple ifelse after processing the getBM result, so that you evaluate the external_gene_name column and use the ensembl_gene_id column as the NA replacement?

I am a beginner in R and I am a bit stuck. I tried with this command, but the problem is that getBM() does not return anything for an ID that has no match.

Do not use a variable named t. t is a popular function in R, and you'll end up having to use its fully qualified name if you need it; plus, your code will confuse people.

You're on the right track. Essentially, you'll need to be sure that the result from getBM is in a data.frame. You can then add a new column (NOT rownames) based on the ifelse. Also, you need to check whether t$external_gene_name is NA. Try this:

Thanks for the reply. I am doing the assignment because I have a dds object from DESeq2, and the object has gene IDs as rownames. So I basically need to convert them and leave only gene symbols as rownames. I came up with a partial solution, which still gives me another problem:
rownames(dds) <- ifelse(rownames(dds) %in% t_obj$ensembl_gene_id, t_obj$external_gene_name, rownames(dds))

That won't work. Try the solution I gave you. ifelse does not vectorize the way you're assuming it will.
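
For reference, the fallback discussed in this thread (use the gene symbol where one exists, otherwise keep the Ensembl ID) could be sketched roughly as below. This is only a minimal sketch, not the code from the replies: bm and ids are hypothetical stand-ins for the getBM() result and for rownames(dds), the last two IDs are made up, and the empty-string check assumes that a gene without a symbol may come back as "" rather than NA.

```r
# Toy stand-in for the getBM() result (hypothetical data). A real query
# returns rows only for the IDs that BioMart could map.
bm <- data.frame(
  ensembl_gene_id    = c("ENSG00000141510", "ENSG00000012048", "ENSG00000999999"),
  external_gene_name = c("TP53", "BRCA1", ""),
  stringsAsFactors   = FALSE
)

# Toy stand-in for rownames(dds); the last ID has no row in bm at all.
ids <- c("ENSG00000012048", "ENSG00000141510",
         "ENSG00000999999", "ENSG00000888888")

# Align the symbols to the order of ids; IDs missing from bm give NA.
symbols <- bm$external_gene_name[match(ids, bm$ensembl_gene_id)]

# Keep the original ID wherever the symbol is NA or empty.
new_names <- ifelse(is.na(symbols) | symbols == "", ids, symbols)

# Several IDs can share one symbol, so disambiguate any repeats.
new_names <- make.unique(new_names)

print(new_names)
```

With a real object, the last step would presumably be rownames(dds) <- new_names. The match() call is what matters here: ifelse() picks element i of its second argument for position i of the test, so passing the external_gene_name column directly (which is in getBM's order and length, not the order of the dds rows) would assign symbols to the wrong rows.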