How to convert a data set with mouse gene names to human gene names using a mapping file
22 months ago
Molang • 0

How do I use a BioMart mapping file to convert mouse gene names to human gene names? The data set I need to convert: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE162301 It's the GSE162301_Deseq_all_results.txt text file. I really need help.

bioMart RNA-seq • 1.9k views

What methods would you like to use? Do you know R? Are you stuck in Excel? Do you have a preferred tool set or environment? How do you normally work with data? Have you tried anything so far and run into a problem? Do you already have a mapping file? What is the format (paste a few lines so we can see)? Does it matter if you get more than a 1:1 mapping, i.e. do you have a plan for when a mouse gene matches more than one human gene? If you are using Excel, something relatively easy and fast would be to use VLOOKUP (google it) to transfer human IDs to your results file, assuming both your files have mouse IDs in common, but it doesn't solve the multiple mapping problem.
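To make the multiple-mapping question concrete, here is a minimal R sketch of how one might count, per mouse gene, how many human genes it maps to. The table and its IDs (`m1`, `h1`, and so on) are made up for illustration and are not taken from any real mapping file.

```r
# Toy mapping table (hypothetical IDs): m2 maps to two human genes,
# and m3 has no human match at all.
map <- data.frame(
  mouse_id = c("m1", "m2", "m2", "m3"),
  human_id = c("h1", "h2", "h3", NA),
  stringsAsFactors = FALSE
)

# Count the non-missing human matches for each mouse gene.
n_human <- tapply(!is.na(map$human_id), map$mouse_id, sum)

multi_mapped <- names(n_human)[n_human > 1]   # more than one human gene
unmapped     <- names(n_human)[n_human == 0]  # no human gene

print(multi_mapped)
print(unmapped)
```

Running a check like this before merging shows how many genes need a decision (keep all matches, keep one, or drop the gene).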


Thanks for actually replying with something constructive. I'm happy to use R, and ideally that is what I would like to use for this. BioMart mapping file:

mouse_id            human_id         human_name  mouse_name  description
ENSMUSG00000056673  ENSG00000012817  KDM5D       Kdm5d       lysine (K)-specific demethylase 5D [Source:MGI Symbol;Acc:MGI:99780]
ENSMUSG00000069049                               Eif2s3y     eukaryotic translation initiation factor 2, subunit 3, structural gene Y-linked [Source:MGI Symbol;Acc:MGI:1349430]
ENSMUSG00000068457  ENSG00000183878  UTY         Uty         ubiquitously transcribed tetratricopeptide repeat containing, Y-linked [Source:MGI Symbol;Acc:MGI:894810]

What do you suggest I do? I just want to alter the data set I have in such a way that the mouse gene names are replaced with human ones. I really need it for my University project.
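For readers following along, a rough R sketch of what this could look like with the three mapping rows shown above. It assumes the mapping file is tab-separated (the description column is left out for brevity) and that the results table identifies genes by mouse symbol. The `results` table, its column names, and its fold-change values are invented placeholders, since the actual layout of GSE162301_Deseq_all_results.txt is not shown in this thread.

```r
# The three mapping rows from the post, written as tab-separated text.
# The second row has empty human_id and human_name fields.
mapping_text <- paste(
  "mouse_id\thuman_id\thuman_name\tmouse_name",
  "ENSMUSG00000056673\tENSG00000012817\tKDM5D\tKdm5d",
  "ENSMUSG00000069049\t\t\tEif2s3y",
  "ENSMUSG00000068457\tENSG00000183878\tUTY\tUty",
  sep = "\n"
)
# na.strings = "" turns the empty fields into NA.
mapping <- read.delim(text = mapping_text, na.strings = "",
                      stringsAsFactors = FALSE)

# Hypothetical results table; real column names and values will differ.
results <- data.frame(
  gene = c("Kdm5d", "Eif2s3y", "Uty"),
  log2FoldChange = c(1.5, -0.7, 2.1),
  stringsAsFactors = FALSE
)

# Left join: keep every result row and add human_name where one exists.
annotated <- merge(results, mapping[, c("mouse_name", "human_name")],
                   by.x = "gene", by.y = "mouse_name", all.x = TRUE)
print(annotated)
```

With a real file, `read.delim("path/to/mapping.txt", na.strings = "")` would replace the `text =` argument. Eif2s3y ends up with `NA` as its human name because the mapping file lists no human counterpart for it.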


"Thanks for actually replying with something constructive"

I'm doing my best to ignore your passive-aggressive attitude. Statements like "urgent" or "help please" work against you in a professional/scientific forum, so a suggestion to not use them is not a joke.


This still isn't enough. Few people are willing to just hand out code to solve problems unless they happen to have exactly the right few lines at their fingertips, and no one can know whether they do because you haven't shown what your input data looks like. If you post the first few lines of your input data, what you want the output to look like, your first stab at the code, and either why the output is incorrect or what you think the error means and how you tried to solve it, people will be far, far more willing to help you.

And to second what Ram said, your lack of planning inspires no urgency on the part of anyone here. People are here as volunteers, answering questions because they think the premise is a little interesting, or to gain familiarity with board conventions and unofficial karma for the day when we have our own questions. It is highly presumptuous to tell other people that your problems are urgent for them.


"URGENT"

No. Just no.


This file contains mappings of human homologs for mouse genes: http://www.informatics.jax.org/downloads/reports/HOM_MouseHumanSequence.rpt


Okay but what do I do with that? How do I implement that and get my data set into the right format?


It's a table of MGI (mouse) IDs mapped to HGNC (human) IDs. You need to map mouse genes to human genes. What more do you need - a resource and code that does your job?
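As a rough illustration only: if a homology table is laid out in "long" format (one row per gene, with mouse and human homologs sharing a group key), it can be reshaped into a mouse-to-human lookup along these lines. The column names `class_key`, `organism`, and `symbol` are hypothetical stand-ins, not the report's actual headers, so check the first line of the downloaded file before adapting this.

```r
# Toy stand-in for a long-format homology table.
# Column names and layout are assumptions, not the real report headers.
hom <- data.frame(
  class_key = c(1, 1, 2, 2, 3),
  organism  = c("mouse", "human", "mouse", "human", "mouse"),
  symbol    = c("Kdm5d", "KDM5D", "Uty", "UTY", "Eif2s3y"),
  stringsAsFactors = FALSE
)

# Split into one table per species and give the symbol columns clear names.
mouse <- hom[hom$organism == "mouse", c("class_key", "symbol")]
human <- hom[hom$organism == "human", c("class_key", "symbol")]
names(mouse)[2] <- "mouse_symbol"
names(human)[2] <- "human_symbol"

# Pair mouse and human rows that share a key. all.x = TRUE keeps mouse
# genes without a human homolog (human_symbol becomes NA).
m2h <- merge(mouse, human, by = "class_key", all.x = TRUE)
print(m2h)
```

The resulting `m2h` table has one row per mouse/human pair and can then be merged onto a results table by mouse symbol.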

22 months ago
seidel 11k

You could try using the merge() function to "join" the two data frames together based on a common ID. If you google "join tables", you'll see info on many of the issues involved (what to do when lines don't match, or when there are multiple matches, etc.). But here's some simple toy code to give an idea of how to merge two data frames when they have something in common.

# create a toy df with mouse IDs and fold-change (fc) values
df1 <- data.frame(m_ids=paste0("g", 1:10), fc=rnorm(10))

# create a pool of possible human IDs; the string "NA" stands in for "no human match"
humanID_possibilities <- c(paste0("hg", 1:20), "NA")

# create a df of mouse to human IDs
m2h <- data.frame(m_ids=sample(paste0("g", 1:10),10), h_ids=sample(humanID_possibilities, 10))

# merge them based on the shared mouse IDs; all.x=TRUE keeps every row of df1
merge(df1, m2h, by="m_ids", all.x=TRUE)

   m_ids           fc h_ids
1     g1 -0.463877051  hg18
2    g10 -0.582063249  hg10
3     g2  0.129008762   hg3
4     g3  1.548453634    NA
5     g4  1.060394485  hg16
6     g5 -1.646870332  hg11
7     g6  0.396264599   hg1
8     g7 -0.216049804  hg13
9     g8 -0.366739414   hg6
10    g9 -0.004712061  hg19

This is a common scenario, for example when merging a gene expression table with a table of gene descriptions that share an ID.

edit: I just saw that this is for a University project. In that case, it's definitely something you should figure out for yourself, as otherwise you're missing the point of the exercise. Nonetheless, here's an example of how to take an idea, write some toy code to test it, and explore before implementing it.
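To illustrate the "no match" and "multiple match" issues mentioned above, here is a small sketch with made-up IDs. It compares a left join with an inner join and shows one possible (not necessarily appropriate) policy for one-to-many matches.

```r
# Toy data (hypothetical IDs): g2 maps to two human genes, g3 to none.
df1 <- data.frame(m_ids = c("g1", "g2", "g3"), fc = c(0.5, -1.2, 2.0))
m2h <- data.frame(m_ids = c("g1", "g2", "g2"),
                  h_ids = c("hg1", "hg2", "hg3"))

# Left join keeps g3 with h_ids = NA; g2 appears twice.
left  <- merge(df1, m2h, by = "m_ids", all.x = TRUE)
# Inner join (the default) drops g3 because it has no match.
inner <- merge(df1, m2h, by = "m_ids")

nrow(left)   # 4
nrow(inner)  # 3

# One possible policy: keep only the first human match per mouse gene.
first_only <- left[!duplicated(left$m_ids), ]
print(first_only)
```

Keeping the first match is arbitrary; whether to keep all matches, pick one, or drop ambiguous genes depends on what the downstream analysis needs.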


Okay, I'm familiar with that function, I'll see what I can do. Thank you so much. I thought I had a word limit in my initial post, so that's why there's little explanation. The issue is that I shouldn't have to be reconfiguring this data; my University messed up and I haven't received any help with my project. I don't even have a Supervisor. I'm on here because I'm not getting any help when I should be, as I don't have a background in data analysis. I just have no clue how to use the mapping file, and didn't even know what one was until today. I just don't know where to start in R.
