9.4 years ago
Bioinformatist Newbie
I am interested in finding the DEGs between 2 microarrays run on different platforms (GPL96 vs. GPL3921). The problem is that the expression matrices have different dimensions and different probe IDs, so they are not directly comparable. Can anyone guide me on how to do that?
I want to compare the Control samples from this study with the LY294002-treated samples from this study.
Looking forward to your suggestions.
There are a couple of issues here. The simplest is the one you mentioned, which can be resolved by just subsetting the datasets.
The bigger issue is that you should only compare probes that are shared between the two datasets. Comparing two different probes for the same gene is typically not going to yield meaningful results. I've never needed to do this, so I don't have any boilerplate code I can paste in.
Devon: Can you elaborate on what you mean by subsetting the dataset? I am new to this field and have limited knowledge, so it would help if you could explain it with an example (one that actually works). Thanks.
Are you there? I am waiting for your answer.
I'd missed your reply.
These are basic R commands, since that's what you'll likely be using. See
help('[')
for details.
GPL3921 is a high-throughput version of GPL96 so if I recall correctly there is almost complete overlap of probe IDs, and the probe sequences are identical for probesets with the same IDs. As noted, for normalization purposes you would want to normalize on the common probesets. But more fundamentally - according to GEO the two groups you are going to compare are HL-60 cells + DMSO (n=3) vs. MCF7 treated with LY-294002 (n=1!) so you're comparing very different cell types, and don't have enough samples for a robust DEG analysis here. Is this really the comparison you want to do?
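As a rough illustration of restricting two datasets to their common probesets, here is a minimal base R sketch. The matrices, sample names and values are made up for the example; in practice the matrices would come from your own normalized data, with probe IDs as row names. It only shows the subsetting and merging step and says nothing about how the data should be normalized.

```r
# Toy expression matrices standing in for the two platforms (made-up values).
# Rows are probe IDs, columns are samples.
set.seed(1)
exprs_a <- matrix(rnorm(12), nrow = 4,
                  dimnames = list(c("1007_s_at", "1053_at", "117_at", "121_at"),
                                  c("ctrl1", "ctrl2", "ctrl3")))
exprs_b <- matrix(rnorm(3), nrow = 3,
                  dimnames = list(c("117_at", "1007_s_at", "1255_g_at"),
                                  "treated1"))

# Probe IDs present in both matrices
common <- intersect(rownames(exprs_a), rownames(exprs_b))

# Subset both matrices to the shared probes, in the same row order.
# drop = FALSE keeps the result a matrix even with a single column.
sub_a <- exprs_a[common, , drop = FALSE]
sub_b <- exprs_b[common, , drop = FALSE]

# Combine into one matrix: same rows, samples side by side
merged <- cbind(sub_a, sub_b)
dim(merged)
```

Indexing by row name with `[` is what guarantees that both subsets end up in the same probe order before `cbind()`. If the shared IDs really do refer to identical probesets, as described above, the merged matrix could presumably be annotated in a single step rather than once per platform.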
Ahill: You're right, most of the probe names are the same. Can you show me with an example how to obtain a common probeset? I think it would be an intersection, but I can't work out how to do that. Should I normalize the two datasets separately or together (in the latter case the matrix dimensions differ, so they are not comparable)? If I normalize them separately, how do I merge the 2 expression sets into 1? And how do I annotate the probes for both platforms: can that be done in one step, or does it also have to be done separately?
The example I quoted was only meant to illustrate my question; I will not actually compare those samples. In real cases I will be comparing samples treated with the same drug but run on different platforms. Mainly I will be using the Connectivity Map dataset. I want to compare all instances in that dataset that were treated with the same drug (some instances will be HG, some HT) against their vehicle-treated controls. Can you also suggest how to proceed with that? With 1000 arrays, doing them one by one is not time-efficient. Thanks.
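One way to avoid handling arrays one by one is to keep a sample sheet and loop over drugs. The sketch below is only an illustration in base R: the `samples` table, its column names, the drug labels and the expression values are all hypothetical, and it assumes a log-scale matrix that has already been restricted to the common probes. It is not the actual Connectivity Map file layout, and a difference of means is not a statistical test.

```r
# Hypothetical sample sheet: one row per array (column names are assumptions)
samples <- data.frame(
  array = c("a1", "a2", "a3", "a4", "a5", "a6"),
  drug  = c("LY-294002", "LY-294002", "vehicle", "vehicle", "drugX", "drugX"),
  stringsAsFactors = FALSE
)

# Toy log-scale expression matrix, already restricted to common probes
set.seed(2)
expr <- matrix(rnorm(18), nrow = 3,
               dimnames = list(c("p1", "p2", "p3"), samples$array))

# Mean expression per probe across the vehicle-treated controls
vehicle_mean <- rowMeans(expr[, samples$drug == "vehicle", drop = FALSE])

# Every treatment except the vehicle
drugs <- setdiff(unique(samples$drug), "vehicle")

# For each drug: mean of its arrays minus the vehicle mean, per probe.
# The result is a probes x drugs matrix of log ratios.
log_ratio <- sapply(drugs, function(d) {
  treated <- expr[, samples$drug == d, drop = FALSE]
  rowMeans(treated) - vehicle_mean
})
log_ratio
```

With enough replicates per group, a dedicated package such as limma would be the more usual choice for calling DEGs; the loop here only shows how the per-drug bookkeeping can be automated.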