Question

Microarray data from different sets but same platform

0

3.7 years ago

Ahmed.waraky ▴ 10

Hi guys, I have a question. Two different studies each used the same array and platform, A-AFFY-44 (Affymetrix GeneChip Human Genome U133 Plus 2.0 [HG-U133_Plus_2]). Can I combine the data from these studies? For example, one study has good controls and the other has a specific type of leukemia; could I run differential expression between them? Thanks!

microarray R gene Affymetrix • 989 views

ADD COMMENT • link updated 3.7 years ago by ATRX ★ 1.1k • written 3.7 years ago by Ahmed.waraky ▴ 10

score 0 · Answer 1 · 2021-03-04

0

3.7 years ago

ATRX ★ 1.1k

Do the following before combining the datasets:

Check the metadata file for both experiments.
Check that the probes measured are consistent across both datasets.
Combine the datasets, normalize the matrix, and then check for possible batch effects using PCA and MDS (i.e. whether the samples cluster by experiment).
If yes, then apply batch correcting algorithms such as sva or ComBat and get the batch corrected matrix.
Apply PCA and MDS to the batch corrected matrix and check whether the samples still cluster by batch. If yes, then use the dataset at your own peril.
If not, then proceed with the downstream analyses.
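
The PCA check in the steps above can be sketched roughly as follows. This is a minimal illustration in Python with NumPy on simulated data, not the workflow of any particular package: the matrix sizes, the noise levels, and the helper name pca_scores are all assumptions made for the example. In practice you would substitute your own normalized log-expression matrix (genes x samples) and study labels.

```python
import numpy as np

def pca_scores(expr, n_components=2):
    """Return sample scores on the first principal components.

    expr is a genes x samples matrix of log-scale expression values.
    (Hypothetical helper written for this illustration.)
    """
    # Center each gene across samples, then decompose samples x genes.
    centered = expr - expr.mean(axis=1, keepdims=True)
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Simulated data (assumption): 500 genes, 6 samples from each of 2 studies.
rng = np.random.default_rng(0)
n_genes, n_per_study = 500, 6
study = np.array([0] * n_per_study + [1] * n_per_study)

baseline = rng.normal(8.0, 1.0, size=(n_genes, 1))
expr = baseline + rng.normal(0.0, 0.3, size=(n_genes, 2 * n_per_study))

# Add a gene-specific technical shift to every sample of study 1.
expr[:, study == 1] += rng.normal(0.0, 1.0, size=(n_genes, 1))

scores = pca_scores(expr)
pc1 = scores[:, 0]

# If PC1 splits the samples cleanly by study, a batch effect is likely.
separated = (pc1[study == 0].max() < pc1[study == 1].min()
             or pc1[study == 1].max() < pc1[study == 0].min())
print("PC1 separates samples by study:", separated)
```

Plotting scores[:, 0] against scores[:, 1] and coloring the points by study gives the usual visual version of this check.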

All the Best,

-Ar

ADD COMMENT • link 3.7 years ago by ATRX ★ 1.1k

1

If yes, then apply batch correcting algorithms such as sva or ComBat and get the batch corrected matrix

Batch is confounded with condition here (each study contributes exactly one condition), so it is mathematically not possible to correct for it.

To answer the question: no, you should not combine these datasets, because you cannot tell whether the differences you see are due to biology or to batch; the two cannot be separated. Here batch1=tumor and batch2=normal, so batch is the same as "condition". Technically you can combine them and run any analysis, but be aware that the results are not reliable; at best they are suitable for exploratory analysis.
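
To illustrate why the correction is not possible, here is a small sketch in Python with NumPy (the sample labels are made up for the example). When batch and condition are identical, the design matrix with an intercept, a condition column, and a batch column loses a rank, so a linear model cannot estimate separate condition and batch effects. With a balanced design, where each batch contains both conditions, the matrix has full column rank.

```python
import numpy as np

def design_matrix(condition, batch):
    """Build a design matrix with intercept, condition, and batch columns."""
    intercept = np.ones(len(condition))
    return np.column_stack([intercept, condition, batch]).astype(float)

# Hypothetical labels: 0 = normal, 1 = tumor; batch 0 = study A, 1 = study B.
condition = np.array([0, 0, 0, 1, 1, 1])

# Confounded: every normal sample is from study A, every tumor from study B.
batch_confounded = np.array([0, 0, 0, 1, 1, 1])

# Balanced: both studies contain normal and tumor samples.
batch_balanced = np.array([0, 1, 0, 1, 0, 1])

rank_confounded = np.linalg.matrix_rank(design_matrix(condition, batch_confounded))
rank_balanced = np.linalg.matrix_rank(design_matrix(condition, batch_balanced))

print("columns in design:", 3)
print("rank when batch == condition:", rank_confounded)  # 2, rank-deficient
print("rank with balanced batches:", rank_balanced)      # 3, full rank
```

In the confounded case the condition and batch columns are the same vector, so any split of the observed difference between "biology" and "batch" fits the data equally well.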

ADD REPLY • link 3.7 years ago by ATpoint 85k

0

What if both studies used the same probes and there were no batch effects? Theoretically, that would mean any differences are purely biological, right? Or not?

ADD REPLY • link 3.7 years ago by Ahmed.waraky ▴ 10

2

The probes are not the problem; the problem is the exact wet-lab procedure: how they extracted RNA, synthesized cDNA, etc. Yes, if there were no batch effect you could do it, but since batch=condition there is no way to diagnose it. From what I've seen in RNA-seq, a massive batch effect is almost guaranteed.
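
As a rough illustration of the risk, the following Python/NumPy simulation (all numbers are assumptions chosen for the example, not estimates from real arrays) generates two groups with no true biological difference. When one group additionally carries a gene-specific technical shift, as it would if it came from another lab, a per-gene Welch t-statistic flags many genes, and nothing in the data tells you those hits are technical.

```python
import numpy as np

def welch_t(a, b):
    """Per-gene Welch t-statistic for two genes x samples matrices."""
    mean_diff = a.mean(axis=1) - b.mean(axis=1)
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    return mean_diff / se

rng = np.random.default_rng(1)
n_genes, n_samples = 2000, 6

# Both groups share the same true expression level: no biological effect.
baseline = rng.normal(8.0, 1.0, size=(n_genes, 1))
group_a = baseline + rng.normal(0.0, 0.3, size=(n_genes, n_samples))
group_b = baseline + rng.normal(0.0, 0.3, size=(n_genes, n_samples))

# Same group_b, but as if processed in a different lab (assumed shift size).
batch_shift = rng.normal(0.0, 0.5, size=(n_genes, 1))
group_b_other_lab = group_b + batch_shift

threshold = 4.0  # arbitrary cutoff on |t| for this illustration
hits_same_lab = int((np.abs(welch_t(group_a, group_b)) > threshold).sum())
hits_two_labs = int((np.abs(welch_t(group_a, group_b_other_lab)) > threshold).sum())

print("genes flagged, same lab:", hits_same_lab)
print("genes flagged, different labs:", hits_two_labs)
```

Because the shift is applied to exactly the samples of one condition, it is indistinguishable from a real expression difference in any per-gene test.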

ADD REPLY • link 3.7 years ago by ATpoint 85k

0

What if I checked for RNA degradation with the affy package? Shouldn't that give an indication of RNA quality? And can't normalization across samples compensate for any differences? So there is no way to check for a batch difference if batch=condition?

ADD REPLY • link 3.7 years ago by Ahmed.waraky ▴ 10