Converting mouse Gene IDs to Human while keeping genes that don't convert
3.4 years ago
oludhe ▴ 90

Hi there,

I am using biomaRt to convert gene IDs from mouse to human for RNA-seq data I generated. I am currently mapping with the following function:

convertMouseGeneList <- function(x) {
  library(biomaRt)
  # Connect to the human and mouse Ensembl datasets
  human <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  mouse <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
  # Look up human orthologs (HGNC symbols) for the mouse MGI symbols in x
  genesV2 <- getLDS(attributes = c("mgi_symbol"), filters = "mgi_symbol", values = x, mart = mouse, attributesL = c("hgnc_symbol"), martL = human, uniqueRows = TRUE)
  # Keep only the unique human symbols (second column of the result)
  humanx <- unique(genesV2[, 2])
  # Print the first 6 genes found to the screen
  print(head(humanx))
  return(humanx)
}

This works, but it doesn't map everything. I have a dataframe of 53569 genes and want to map as many of the mouse genes to human genes as possible (I want to run the data through a bulk deconvolution package that uses a human dataset). I am currently pulling the genes out of the dataframe into a list and converting that, but only 18411 genes are returned. I would like to replace the genes that do map with their human orthologs and keep the remaining genes in the same dataframe. How would I do that here?

Alternatively, I could create a new dataframe containing only the mapped genes, but I would need to link each human gene back to its original mouse gene so that the expression counts from the samples stay attached to the right gene. Any ideas on how I can achieve that?

Thanks!

Tom

bioMart RNA-seq Bioconductor R • 3.5k views

You should provide an example of what your expected output would be. It is extremely unclear from your post which pieces of data you want to keep in your final dataframe.

You can likely use functions such as subset() or the %in% operator to create the right data.frames.
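
As a rough illustration of the subset() / %in% idea, here is a minimal sketch on toy data. The `orthologs` table stands in for the two-column result of getLDS(); its column names (`MGI.symbol`, `HGNC.symbol`), the `counts` dataframe, its `gene` column, and the placeholder gene names are all assumptions for the example, not taken from your data.

```r
# Toy stand-in for the getLDS() result: one row per mouse/human pair.
# Column names are assumed; check colnames() on your own result.
orthologs <- data.frame(
  MGI.symbol  = c("mousegene1", "mousegene3", "mousegene4"),
  HGNC.symbol = c("humanmappedgene1", "humanmappedgene3", "humanmappedgene4"),
  stringsAsFactors = FALSE
)

# Toy expression table; 'gene' holds the mouse symbols (hypothetical layout).
counts <- data.frame(
  gene    = c("mousegene1", "mousegene2", "mousegene3", "mousegene4", "mousegene5"),
  sample1 = c(10, 0, 25, 300, 2),
  sample2 = c(12, 1, 30, 280, 0),
  stringsAsFactors = FALSE
)

# Rows whose mouse symbol has a human ortholog in the table
mapped <- subset(counts, gene %in% orthologs$MGI.symbol)

# Rows for which no ortholog was returned
unmapped <- subset(counts, !(gene %in% orthologs$MGI.symbol))
```

Both pieces keep their expression columns, so which one you carry forward depends on what the final dataframe is supposed to contain.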


What I want to do is take the list of mouse genes, e.g. [mousegene1, mousegene2, mousegene3, mousegene4, mousegene5], map whichever of them have human orthologs, and keep the ones that don't map in their original positions, e.g. [humanmappedgene1, mousegene2, humanmappedgene3, humanmappedgene4, mousegene5]. That way I can put the converted gene list straight back into the dataframe it was extracted from. The order matters because each row of the dataframe holds the expression values for one gene across the samples (e.g. sample 1, sample 2 and sample 3), and I want every mapped gene to keep its expression information.
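
One way this could be sketched is with match(), which looks up each mouse gene in a mapping table and returns NA where there is no hit. The helper name `replaceWithOrthologs` is hypothetical, and the column names `MGI.symbol` and `HGNC.symbol` are assumed for a table holding the full two-column getLDS() result (rather than only the unique human symbols).

```r
# Hypothetical helper: swap in the human symbol where a mapping exists and
# keep the original mouse symbol otherwise. Order and length are preserved.
replaceWithOrthologs <- function(mouse_genes, orthologs) {
  # Drop pairs with a missing or empty human symbol
  ok <- !is.na(orthologs$HGNC.symbol) & orthologs$HGNC.symbol != ""
  orthologs <- orthologs[ok, ]
  # Keep the first human hit per mouse gene (an arbitrary choice for this sketch)
  orthologs <- orthologs[!duplicated(orthologs$MGI.symbol), ]
  # Position of each input gene in the mapping table, NA if absent
  idx <- match(mouse_genes, orthologs$MGI.symbol)
  ifelse(is.na(idx), mouse_genes, orthologs$HGNC.symbol[idx])
}

# Toy mapping table standing in for the getLDS() output (assumed column names)
orthologs <- data.frame(
  MGI.symbol  = c("mousegene1", "mousegene3", "mousegene4"),
  HGNC.symbol = c("humanmappedgene1", "humanmappedgene3", "humanmappedgene4"),
  stringsAsFactors = FALSE
)

mouse_genes <- c("mousegene1", "mousegene2", "mousegene3", "mousegene4", "mousegene5")
converted <- replaceWithOrthologs(mouse_genes, orthologs)
print(converted)
```

Because the result has the same length and order as the input, it could be assigned back to the gene column (or row names) of the original dataframe. Note that several mouse genes can map to the same human symbol, so duplicates may need handling before using the symbols as row names.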
