Microarray data from different sets but same platform
3.7 years ago
Ahmed.waraky ▴ 10

Hi all, I have a question. Two different studies each used the same array and platform, A-AFFY-44 (Affymetrix GeneChip Human Genome U133 Plus 2.0 [HG-U133_Plus_2]). Can I combine the data from these studies? For example, one study has good controls and the other has a specific type of leukemia; could I run a differential expression analysis between them? Thanks!

microarray R gene Affymetrix • 990 views
3.7 years ago
ATRX ★ 1.1k

Do the following before combining the datasets:

  1. Check the metadata file for both experiments.
  2. Check that the probes measured are consistent across both datasets.
  3. Combine the datasets, normalize the matrix, and then check for possible batch effects using PCA and MDS (i.e. whether the samples cluster by experiment).
  4. If they do, apply a batch-correction method such as sva or ComBat to get a batch-corrected matrix.
  5. Apply PCA and MDS to the batch-corrected matrix and check whether the samples still cluster by batch. If they do, use the dataset at your own peril.
  6. If not, then do the downstream analyses.
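
The PCA check in steps 3 and 5 can be sketched as follows. This is a minimal illustration in plain Python on simulated values (not data from either study); the function names, the simulated shift of 2.0, and the noise level are all assumptions made for the example. In practice one would more likely run PCA or MDS in R on the normalized expression matrix and inspect the plot.

```python
import random

def center_columns(matrix):
    """Subtract each gene's (column's) mean across samples."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]

def first_pc_scores(matrix, n_iter=200):
    """Values proportional to each sample's PC1 score.

    Uses power iteration on the sample-by-sample Gram matrix of the
    gene-centered data (rows = samples, columns = genes).
    """
    x = center_columns(matrix)
    n = len(x)
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # arbitrary non-zero start vector
    for _ in range(n_iter):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(t * t for t in w) ** 0.5
        v = [t / norm for t in w]
    return v

def separates_by_batch(scores, batch):
    """True if PC1 puts the two batches on non-overlapping ranges."""
    a = [s for s, b in zip(scores, batch) if b == 0]
    b_ = [s for s, b in zip(scores, batch) if b == 1]
    return max(a) < min(b_) or max(b_) < min(a)

# Simulated example (assumption): 6 samples x 50 genes, where every gene in
# batch 1 is shifted by 2.0 on top of small Gaussian noise.
rng = random.Random(0)
batch = [0, 0, 0, 1, 1, 1]
expr = [[5.0 + 2.0 * b + rng.gauss(0, 0.3) for _ in range(50)] for b in batch]

scores = first_pc_scores(expr)
clusters_by_batch = separates_by_batch(scores, batch)
print(clusters_by_batch)
```

If the samples separate by batch like this, that is the signal step 4 reacts to. Note that the check only says the groups differ; it cannot say whether the difference is technical or biological.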

All the Best,

-Ar


If yes, then apply batch correcting algorithms such as sva or ComBat and get the batch corrected matrix

Batch is completely confounded with condition here (each study contributes only one condition), so it is mathematically not possible to correct for it.

To answer the question: no, you should not combine these datasets, because you cannot tell whether the differences you see are due to biology or to batch; the two cannot be separated. Here batch1=tumor and batch2=normal, so batch is the same as "condition". Technically you can combine them and run any analysis, but be aware that the results are not reliable; at best this is exploratory analysis.
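
One way to see why the correction is mathematically impossible is to look at the design matrix of a linear model with an intercept, a condition term, and a batch term. The sketch below is an illustration in plain Python; the sample layouts are hypothetical and the helper names are made up for the example. When batch equals condition the columns are linearly dependent, so the model cannot estimate separate batch and condition effects.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a pivot row at or below the current rank.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def design_matrix(condition, batch):
    """One row per sample; columns: intercept, condition, batch indicator."""
    return [[1, c, b] for c, b in zip(condition, batch)]

# Layout as in the thread (hypothetical sample counts): study 1 is all
# leukemia, study 2 is all control, so batch is the mirror image of condition.
confounded = design_matrix(condition=[1, 1, 1, 0, 0, 0],
                           batch=[0, 0, 0, 1, 1, 1])

# Hypothetical alternative: each study contains both conditions.
balanced = design_matrix(condition=[1, 1, 0, 0, 1, 0],
                         batch=[0, 0, 0, 1, 1, 1])

rank_confounded = matrix_rank(confounded)
rank_balanced = matrix_rank(balanced)
print(rank_confounded, rank_balanced)
```

The confounded design has three columns but rank 2, because condition + batch equals the intercept column. The balanced design has full rank 3, which is the situation in which a batch term can actually be estimated alongside condition.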


What if both studies used the same probes and there were no batch effects? Theoretically, that would indicate any differences are only biological, right? Or not?


The probes are not the problem; rather, it is the exact wet-lab procedure: how they extracted RNA, synthesized cDNA, etc. Yes, if there were no batch effect you could do it, but as batch=condition there is no way to diagnose it. From what I've seen in RNA-seq, a massive batch effect is almost guaranteed.


What if I checked for RNA degradation with the affy package? Shouldn't that give an indication of RNA quality? And can't normalization across samples compensate for any differences? So there is no way to check for a batch difference if batch=condition?
