Question

How To Handle A Probe That Is Mapped To Multiple UniProt IDs

0


11.7 years ago

TitoPullo ▴ 190

I'm new to bioinformatics and I am working with a "benchmark" cancer dataset. I need to retrieve information from the GO database, so I'm trying to convert my dataset's probe IDs to UniProt IDs (I'm using the MadGene tool). But I have two problems:

1) Some probes are mapped to multiple UniProt IDs (e.g. U70732rna1at -> Q7Z4T9 and P24298). How should I handle this situation? Which UniProt ID should I select for my probe?

2) Sometimes different probes are mapped to the same UniProt ID. What is the best way to resolve this? Should I average the gene expression values of such probes?

uniprot mapping id • 4.6k views

ADD COMMENT • link updated 11.7 years ago by aravind ramesh ▴ 540 • written 11.7 years ago by TitoPullo ▴ 190

3


For the first question, maybe you can take a look at the UniProt IDs themselves. Using your example U70732rna1at -> Q7Z4T9 and P24298, if the first one (Q7Z4T9) is a Swiss-Prot (reviewed) entry and the second one (P24298) is a TrEMBL (unreviewed) entry, I would say pick the first one.
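A minimal Python sketch of that rule, assuming you have already looked up which accessions are reviewed. The function name `pick_accession` and the `reviewed` table are hypothetical, and the flags below only mirror the "if" in the sentence above; they are not the entries' real UniProt status.

```python
def pick_accession(candidates, reviewed):
    """Return the first reviewed (Swiss-Prot) accession among the candidates.

    Falls back to the first candidate when none is flagged as reviewed,
    and to None when the candidate list is empty.
    """
    for acc in candidates:
        if reviewed.get(acc, False):
            return acc
    return candidates[0] if candidates else None


# Placeholder review flags (assumption, NOT the real status of these entries):
# check each accession in UniProt before relying on this table.
reviewed = {"Q7Z4T9": True, "P24298": False}

# Probe-to-accession mapping taken from the question's example.
probe_to_ids = {"U70732rna1at": ["Q7Z4T9", "P24298"]}

chosen = {probe: pick_accession(ids, reviewed) for probe, ids in probe_to_ids.items()}
print(chosen)
```

If both candidates turn out to be reviewed, this sketch simply keeps the first one, so the order of the candidate list matters.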

ADD REPLY • link 11.7 years ago by Sudeep ★ 1.7k

0


What I do is select any one ID and treat all the IDs as synonymous. Whenever I download any data, I first format the data according to the IDs I have, so that it never matters which of the multiple IDs you get. I know this might not be the appropriate way, but I haven't got any other solution in mind.

ADD REPLY • link 11.7 years ago by aravind ramesh ▴ 540

0


So you basically duplicate the data about that probe as many times as the number of "equivalent" IDs you get, don't you?

ADD REPLY • link 11.7 years ago by TitoPullo ▴ 190

0


No, there is no duplication involved here. Just treat all the multiple IDs mapped to one ID as a single ID.

ADD REPLY • link 11.7 years ago by aravind ramesh ▴ 540

0


Can you please give an example?

ADD REPLY • link 11.7 years ago by TitoPullo ▴ 190

0


Please check the answer below.

ADD REPLY • link 11.7 years ago by aravind ramesh ▴ 540

score 0 · Answer 1 · 2013-03-19

This is what I usually do when one ID gets mapped to several other IDs. For example, take a RefSeq protein ID (NP_12345) and map it to Swiss-Prot IDs; suppose it maps to multiple entries in the database (ABCD and ACXF). Treat both of them as synonymous. There are four possible cases:

a) Both IDs are present in the downloaded data. Solution: merge the entries found for both into one, and name it with whichever of the two IDs you prefer.

b) Only the first ID is present in the data. Solution: the multiple mapping causes no problem.

c) Only the second ID is present in the data. Solution: the multiple mapping causes no problem (same as above).

d) There is no data for either ID. Solution: there is nothing to map, so no problem arises.

I hope this helps. Again, this is just what I do; I don't know whether it is the correct way, so use it cautiously.
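The four cases can be sketched in Python roughly as follows. This is only an illustration under assumptions: `merge_synonyms` is a hypothetical helper, the downloaded data is assumed to be a dict from ID to a list of records, and the record values (`term_1`, ...) are placeholders, not real annotations.

```python
def merge_synonyms(synonyms, data):
    """Pool the records of all synonymous IDs under a single name.

    Returns (name, records). The name is the first synonym found in the
    data, or None when no synonym is present (case d).
    """
    present = [acc for acc in synonyms if acc in data]
    if not present:
        return None, []  # case d: no data for any of the IDs
    merged = []
    for acc in present:  # cases a, b and c all reduce to this loop
        for record in data[acc]:
            if record not in merged:  # keep each record once
                merged.append(record)
    return present[0], merged


# NP_12345 maps to two IDs, as in the example above.
synonyms = ["ABCD", "ACXF"]

# Case a: both IDs are present in the (placeholder) downloaded data.
data = {"ABCD": ["term_1", "term_2"], "ACXF": ["term_2", "term_3"]}
name, records = merge_synonyms(synonyms, data)
print(name, records)
```

Records shared by both IDs are kept once, so nothing is duplicated; only the label of the merged entry is an arbitrary choice.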