Compare pathway activation across microarray datasets?
3.8 years ago
thyleal ▴ 160

Dear colleagues,

I have several microarray datasets, each covering a different disease, and I'd like to generate a score that indicates the activity of a given pathway in a way that is comparable across datasets. Most of them come from the same Affymetrix platform, but not all.

I started by reanalyzing each dataset from scratch, using GCRMA for background correction and quantile normalization.

So far, I have tried:

  1. Calculating a molecular distance to health score, using the normal/control/baseline samples present within each dataset. The results look suspicious because, irrespective of the gene set, there are huge discrepancies among diseases. This method is described in Pankla et al. 2009.

  2. Calculating GSVA and ssGSEA scores separately in each dataset, but I'm not sure if I can compare those scores across datasets. If not, should I take the ratio between case and control scores within each dataset and compare those ratios between datasets?
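
To make the second option concrete, here is a minimal sketch of referencing each case's pathway score to the controls of its own dataset. It uses a standardized difference rather than a ratio, because GSVA scores can be zero or negative and ratios then behave badly. The function name, the sample IDs, and the toy scores are all hypothetical, and this does not settle whether the resulting values are truly comparable across datasets.

```python
from statistics import mean, stdev

def control_referenced_z(scores, control_ids):
    """Express each non-control sample's pathway score as a z-score
    relative to the controls of the SAME dataset.

    scores      : dict sample_id -> pathway score (e.g. one GSVA row)
    control_ids : set of sample_ids that are controls in this dataset
    """
    ctrl = [scores[s] for s in control_ids]
    mu, sd = mean(ctrl), stdev(ctrl)  # needs >= 2 controls and sd > 0
    return {s: (v - mu) / sd for s, v in scores.items() if s not in control_ids}

# Toy scores for one dataset (made-up numbers, for illustration only)
dataset_a = {"ctrl1": -0.10, "ctrl2": 0.00, "ctrl3": 0.10,
             "case1": 0.40, "case2": 0.30}
z_a = control_referenced_z(dataset_a, {"ctrl1", "ctrl2", "ctrl3"})
print(z_a)
```

Running the same function on each dataset separately gives one unitless value per case, which is at least on a common scale even if platform effects remain.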

Unfortunately, I cannot merge the datasets, adjust for dataset effects, and perform differential expression directly, because the datasets do not contain the same groups of samples.

Thanks.

microarray normalization scores • 1.2k views

Have you tried the gene set test functions in limma? (https://rdrr.io/bioc/limma/man/geneSetTest.html) They let you define and score arbitrary gene sets. So if you have various pathways, it's an easy way to generate a score for each pathway in each dataset. You may then be able to make sense of a matrix of scores that contains your pathway of interest alongside other pathways that serve as "controls", or even random selections of genes to establish some sense of variance. (I guess this is similar to what you've already tried in 2.) It might not be statistically robust, but you could probably generate a heat map to see whether your pathway shows "activity" distinct from the others.
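
The random gene set idea can be sketched roughly as follows. This is not limma's implementation, just a plain-Python illustration under assumptions: the per-gene statistics (for example case-vs-control t-statistics from one dataset) are already computed, the pathway score is simply their mean, and the function names and toy data are hypothetical.

```python
import random
from statistics import mean, pstdev

def random_set_zscore(stats, gene_set, n_random=1000, seed=0):
    """Compare a gene set's mean statistic with the means of random
    gene sets of the same size drawn from the same dataset.

    stats    : dict gene -> per-gene statistic (e.g. a t-statistic)
    gene_set : iterable of gene names; genes missing from stats are ignored
    """
    rng = random.Random(seed)
    genes = sorted(stats)
    present = [g for g in gene_set if g in stats]
    if not present:
        raise ValueError("no gene of the set is measured in this dataset")
    observed = mean(stats[g] for g in present)
    # Null distribution: mean statistic of random sets of equal size
    null = [mean(stats[g] for g in rng.sample(genes, len(present)))
            for _ in range(n_random)]
    return (observed - mean(null)) / pstdev(null)

# Toy example: 200 genes with noise-only statistics, 10 pathway genes shifted up
toy_rng = random.Random(1)
stats = {f"g{i}": toy_rng.gauss(0.0, 1.0) for i in range(200)}
pathway = [f"g{i}" for i in range(10)]
for g in pathway:
    stats[g] += 3.0
z = random_set_zscore(stats, pathway)
print(round(z, 2))
```

Because the null is rebuilt inside each dataset, the resulting z-scores can be arranged as a pathway-by-dataset matrix for the heat map.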


Thanks for the suggestion. The score you refer to as the output of the geneSetTest function in limma is a P-value, right?

2.2 years ago
xingyu ▴ 10

I want to ask a question about your purpose. Do you want to compare the differential pathways in the same sample?


Hi there. I wanted to get a score representative of pathway activity by comparing cases vs controls for many diseases from independent datasets. In the end, I calculated standardized effect sizes for the log2FC and aggregated them per pathway (median) to compare across datasets.
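
For anyone who wants to try something similar, here is a minimal sketch of that kind of calculation. It assumes Cohen's d with a pooled standard deviation as the standardized effect size, which is only one possible choice, and the function names, gene names, and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, median, variance

def cohens_d(case, control):
    """Standardized mean difference of log2 expression (case minus control),
    using the pooled sample variance. Needs >= 2 samples per group and
    non-zero pooled variance."""
    n1, n2 = len(case), len(control)
    pooled_var = ((n1 - 1) * variance(case) +
                  (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(case) - mean(control)) / sqrt(pooled_var)

def pathway_score(expr, case_ids, control_ids, gene_set):
    """Median of per-gene effect sizes over the pathway genes that are
    measured in this dataset.

    expr : dict gene -> dict sample_id -> log2 expression
    """
    effects = [
        cohens_d([expr[g][s] for s in case_ids],
                 [expr[g][s] for s in control_ids])
        for g in gene_set if g in expr
    ]
    return median(effects) if effects else None

# Toy dataset (made-up values): G1 goes up in cases, G3 goes down
expr = {
    "G1": {"c1": 5.0, "c2": 5.2, "c3": 4.8, "d1": 7.0, "d2": 7.2, "d3": 6.8},
    "G3": {"c1": 8.0, "c2": 8.4, "c3": 7.6, "d1": 7.0, "d2": 7.4, "d3": 6.6},
}
controls, cases = ["c1", "c2", "c3"], ["d1", "d2", "d3"]
score = pathway_score(expr, cases, controls, ["G1", "G3", "G_missing"])
print(score)
```

Each dataset is processed on its own, so only the unitless pathway medians are compared across datasets, and genes missing from a platform are simply skipped.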
