Question

Combining multiple datasets increases the fgsea p-value

2.5 years ago

Nemo • 0

Hi,

I have three influenza datasets from GEO (GSE34205, GSE48466, and GSE30723). Since each dataset has a low number of control samples, I would like to pool the control samples from all three datasets and compare that pool against the disease samples from each dataset separately. As far as I can tell, the datasets use the same platform and version. After doing so, the p-value from fgsea increased, which is undesirable. What can I do? Am I doing something wrong?

enrichment fgsea dataset • 1.2k views

ADD COMMENT • link updated 2.5 years ago by mark.ziemann ★ 2.0k • written 2.5 years ago by Nemo • 0

Can you detail your methods a little more? It's unclear what exactly you're doing at the moment.

ADD REPLY • link 2.5 years ago by rpolicastro 13k

When I run fgsea on each dataset separately, the p-value for the influenza gene set is not significant. Since the two groups (disease and control) have very different sample numbers, I thought of using the control samples from all three datasets. For example, say I have 10 disease samples and 3 control samples in the first dataset, and 40 disease and 10 control samples in the second dataset. I treat the 3 + 10 control samples and the 10 disease samples from the first dataset as one single study and then run fgsea. Since I am making a bigger pool of control samples, I expected a better p-value. However, it got worse. I hope this is clear.

ADD REPLY • link 2.5 years ago by Nemo • 0

score 0 · Answer 1 · 2022-12-16

You can treat the three experiments as "batches", encoded as an unordered factor in the sample sheet, and then correct for them when setting up the model in DESeq2.

library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData = x, colData = samplesheet, design = ~ batch + treatment)
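For illustration, here is a minimal sketch of how such a sample sheet could be built in base R. The sample counts are the ones mentioned in the comment above (10 disease + 3 control, 40 disease + 10 control). The sample names and the labels `study1`/`study2` are made up for the example, and the column names `batch` and `treatment` are simply chosen to match the design formula.

```r
# Hypothetical sample sheet for two studies (labels and sample names are
# placeholders, not taken from the GEO records).
samplesheet <- data.frame(
  batch     = rep(c("study1", "study2"), times = c(13, 50)),
  treatment = c(rep(c("disease", "control"), times = c(10, 3)),
                rep(c("disease", "control"), times = c(40, 10)))
)
rownames(samplesheet) <- paste0("sample", seq_len(nrow(samplesheet)))

# Unordered factors; "control" is made the reference level for treatment.
samplesheet$batch     <- factor(samplesheet$batch)
samplesheet$treatment <- factor(samplesheet$treatment,
                                levels = c("control", "disease"))

# Sanity check: the design ~ batch + treatment should have full column rank,
# otherwise the batch and treatment effects cannot be separated.
design    <- model.matrix(~ batch + treatment, data = samplesheet)
full_rank <- qr(design)$rank == ncol(design)

print(table(samplesheet$batch, samplesheet$treatment))
print(full_rank)
```

The row names of the sample sheet would need to match the column names of the count matrix `x`, in the same order, before passing both to `DESeqDataSetFromMatrix`.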

Another approach would be to consider each experiment separately and then use a multi-contrast enrichment tool like mitch to detect the common trends in pathway regulation.
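To see why ignoring the batch can make things worse rather than better, here is a toy sketch in base R. The numbers are invented for a single hypothetical gene, and a Welch t-test stands in for the real differential expression and fgsea steps, so this only illustrates the general effect and is not a reproduction of the analysis in the question. The assumption is that the second study's measurements are shifted by a study-specific offset.

```r
# Invented expression values for one hypothetical gene.
disease_study1 <- c(6.0, 6.1, 5.9, 6.2, 5.8, 6.0)
control_study1 <- c(5.0, 5.1, 4.9)
# Controls from a second study, shifted upwards by an assumed batch offset.
control_study2 <- c(8.0, 8.1, 7.9, 8.2, 7.8)

# Comparison within one study: disease vs. that study's own controls.
p_within <- t.test(disease_study1, control_study1)$p.value

# Naive pooling: disease from study 1 vs. controls from both studies,
# with no batch correction.
pooled_controls <- c(control_study1, control_study2)
p_pooled <- t.test(disease_study1, pooled_controls)$p.value

print(c(within = p_within, pooled = p_pooled))
```

In this toy case the pooled comparison has more control samples, but the batch offset inflates the variance of the control group and shifts its mean, so the p-value goes up instead of down. Including `batch` in the model, or analysing each experiment separately, avoids mixing that offset into the disease vs. control contrast.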