Combining multiple datasets increases the fgsea p-value
23 months ago
Nemo • 0

Hi,

I have three Influenza datasets from GEO (GSE34205, GSE48466, and GSE30723). Since each dataset has only a small number of control samples, I would like to pool the control samples from all three datasets and compare that pool against the disease samples from each dataset separately. As far as I could check, these datasets use the same platform and version. After doing so, the p-value from fgsea increased, which is undesirable. What can I do? Am I doing something wrong?

enrichment fgsea dataset • 925 views

Can you describe your methods in a little more detail? It's unclear what exactly you're doing at the moment.


When I run fgsea on each dataset separately, the p-value for the Influenza gene set is not significant. Since the disease and control groups differ in size, I thought of using the control samples from all three datasets together. For example, say the first dataset has 10 disease samples and 3 control samples, and the second dataset has 40 disease and 10 control samples. I treat the 3 + 10 control samples and the 10 disease samples as one single study and then run fgsea. Since I am making a bigger pool of control samples, I would expect a better (smaller) p-value. However, it got worse. I hope this is clear.
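
One possible reason a bigger control pool can hurt is that controls from another study may carry a study-specific shift (a batch effect), which adds variance to the pooled group. The toy example below uses made-up log-expression values for a single gene (not taken from the GEO datasets) and base R only. The shift of 2 units between studies is an assumption chosen purely for illustration.

```r
# Made-up log-expression values for one gene (illustrative only).
ctrl_1 <- c(4.9, 5.0, 5.1)                      # 3 controls, study 1
dis_1  <- 6 + seq(-0.2, 0.2, length.out = 10)   # 10 disease samples, study 1
ctrl_2 <- 7 + seq(-0.2, 0.2, length.out = 10)   # 10 controls, study 2 (assumed batch shift)

# Disease vs. controls within study 1 only.
p_within <- t.test(dis_1, ctrl_1)$p.value

# Disease from study 1 vs. controls pooled from both studies, ignoring the study.
p_pooled <- t.test(dis_1, c(ctrl_1, ctrl_2))$p.value

# Same pooled samples, but with the study included as a batch term.
d <- data.frame(
  y = c(ctrl_1, dis_1, ctrl_2),
  batch = factor(c(rep("study1", 13), rep("study2", 10))),
  treatment = factor(
    c(rep("control", 3), rep("disease", 10), rep("control", 10)),
    levels = c("control", "disease")
  )
)
fit <- lm(y ~ batch + treatment, data = d)
effect_adjusted <- unname(coef(fit)["treatmentdisease"])

print(c(p_within = p_within, p_pooled = p_pooled))
print(effect_adjusted)
```

In this toy setup the naive pooled comparison gives a larger p-value than the within-study one, and the pooled control mean even ends up above the disease mean. With the batch term, the estimated disease effect matches the within-study difference. Note that study 2 contributes only controls here, so its samples inform the batch estimate rather than the disease effect itself.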

23 months ago
mark.ziemann ★ 1.9k

You can treat the three experiments as "batches", specified as an unordered factor in the samplesheet, and then correct for them when setting up the model in DESeq2:

dds <- DESeqDataSetFromMatrix(countData = x, colData = samplesheet, design = ~ batch + treatment)
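
A slightly fuller sketch of that idea, from the batch-aware model through to fgsea, might look like the following. It assumes raw count data (which is what DESeq2 expects) and uses simulated counts from `makeExampleDESeqDataSet` in place of the real datasets. The batch labels, the `toy_set` gene set, and the choice to rank genes by the Wald statistic are illustrative assumptions, not something taken from the question.

```r
library(DESeq2)
library(fgsea)

set.seed(1)

# Simulated counts stand in for the combined data: 1000 genes, 12 samples.
x <- counts(makeExampleDESeqDataSet(n = 1000, m = 12))

# Sample sheet: three studies treated as batches, two conditions in each.
samplesheet <- data.frame(
  batch = factor(rep(c("study1", "study2", "study3"), each = 4)),
  treatment = factor(rep(c("control", "disease"), times = 6),
                     levels = c("control", "disease")),
  row.names = colnames(x)
)

# Batch enters the design before the variable of interest.
dds <- DESeqDataSetFromMatrix(countData = x, colData = samplesheet,
                              design = ~ batch + treatment)
dds <- DESeq(dds)
res <- results(dds, contrast = c("treatment", "disease", "control"))

# Rank genes by the Wald statistic (one common choice among several).
stats <- res$stat
names(stats) <- rownames(res)
stats <- sort(stats[!is.na(stats)], decreasing = TRUE)

# Hypothetical gene set; in practice this would be the Influenza gene set.
pathways <- list(toy_set = sample(names(stats), 50))

fg <- fgsea(pathways = pathways, stats = stats, minSize = 15, maxSize = 500)
print(fg[, c("pathway", "pval", "padj", "NES", "size")])
```

Because the simulated data contain no real disease effect, the toy gene set is not expected to be enriched; the point is only the order of the steps.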

Another approach would be to consider each experiment separately and then use a multi-contrast enrichment tool like mitch to detect the common trends in pathway regulation.
