Imbalance between up- and down-regulated genes
6.5 years ago
Angela ▴ 30

Hello Biostars, I ran a differential expression analysis with DESeq2 using LFC=1 and FDR=0.05, but the resulting list of significant differentially expressed genes is imbalanced (1345 upregulated and 38 downregulated genes). Does this mean my analysis is incorrect, or can this happen? Thanks

rna-seq • 4.2k views

Why should the number be balanced? You may indeed have more up-regulated genes than down-regulated ones in this analysis.

I've seen this before and it's always something to worry about. However, I've also seen it when analysing a dataset where an inhibitor of a global repressor of expression has been used - and it was consistent with other similar datasets.

You've got to wonder whether there is real biology going on, or whether there is a problem with your raw data or your pipeline. So, are there large differences in the quality, or the size, or the complexity of the libraries? On an MA plot, is there any trend in logFC with respect to average expression? If you iterate dropping-out each sample and rerunning DESeq2, does the bias remain in every iteration? Is there a confounding batch-effect between the two arms?
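One way to look for a trend in logFC against average expression, as asked above, is to bin genes by mean log expression and compare the median log fold change per bin. The sketch below is a minimal illustration in plain Python, assuming you have exported per-gene mean normalised counts for each arm. The function names `ma_values` and `median_m_by_bin` and the pseudocount of 1 are illustrative choices, not part of DESeq2.

```python
import math
from statistics import median


def ma_values(control_means, test_means, pseudocount=1.0):
    """Return one (A, M) pair per gene.

    A is the mean of the two log2 expression values and M is the
    log2 fold change (test over control). The pseudocount is an
    assumption made here to avoid log2(0).
    """
    pairs = []
    for c, t in zip(control_means, test_means):
        log_c = math.log2(c + pseudocount)
        log_t = math.log2(t + pseudocount)
        pairs.append(((log_c + log_t) / 2.0, log_t - log_c))
    return pairs


def median_m_by_bin(pairs, n_bins=4):
    """Sort genes by A, split them into n_bins groups, return median M per group."""
    ordered = sorted(pairs)
    n = len(ordered)
    medians = []
    for b in range(n_bins):
        chunk = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        if chunk:  # skip empty bins when there are fewer genes than bins
            medians.append(median(m for _, m in chunk))
    return medians
```

If the per-bin medians sit near zero at high expression but drift upward at low expression, the imbalance may be driven mostly by low-count genes rather than by a global shift.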

Thanks. Yes, there is a batch effect between samples: they are not from the same run, and I ran the analysis on 24 test and 3 control samples. My test samples are tumor samples infected by a virus. I tried several ways of running the analysis. Without removing the batch effect, there are 1249 upregulated and 14 downregulated genes; after removing batch effects, there are 1345 upregulated and 38 downregulated genes. The number of expressed genes is not the same either!

Could you post an MA plot?

Hello Russh, sorry for my late reply. Here is the MA plot after using lfcShrink:

[Image: MA plot after lfcShrink]

Could you repost that and highlight only the 38 downregulated and the 1345 upregulated features? There looks to be a fairly even directional balance above a mean of ~100 normalised counts.
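To see whether the imbalance holds across expression levels, the significant genes can be counted by direction above a chosen mean-count threshold. This is a rough sketch, assuming the results table was exported with DESeq2's `baseMean`, `log2FoldChange` and `padj` columns and loaded as a list of dicts. `count_directions` and its default cutoffs are hypothetical helpers for illustration only.

```python
def count_directions(rows, lfc_cutoff=1.0, alpha=0.05, min_base_mean=0.0):
    """Count significant up- and down-regulated genes.

    rows: iterable of dicts with keys 'baseMean', 'log2FoldChange', 'padj'.
    Genes with a missing padj (None, i.e. NA in the export) are skipped.
    Returns a tuple (n_up, n_down).
    """
    n_up = 0
    n_down = 0
    for row in rows:
        padj = row["padj"]
        if padj is None or padj >= alpha:
            continue  # not significant, or filtered out upstream
        if row["baseMean"] < min_base_mean:
            continue  # below the expression stratum of interest
        lfc = row["log2FoldChange"]
        if lfc > lfc_cutoff:
            n_up += 1
        elif lfc < -lfc_cutoff:
            n_down += 1
    return n_up, n_down
```

Calling it once with `min_base_mean=0` and once with `min_base_mean=100` gives the up/down split overall and for genes above ~100 normalised counts, which can then be compared directly.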

[Image: MA plot]

Hello Russhh, I have posted the new MA plot.

6.5 years ago

When you apply a cutoff on raw p-values, you will get a larger number of significantly upregulated or downregulated genes. If you instead filter those genes on FDR-adjusted values (< 0.05 in most cases), the number will decrease. Genes with FDR < 0.05 can be called significant with more confidence. So a small number of downregulated genes does not by itself mean your analysis went wrong. It can happen, depending on the controls you have used.

Even so, I would suggest revisiting your LFC cutoff and checking that your control group is good enough.
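The difference between a raw p-value cutoff and an FDR cutoff can be illustrated with a small Benjamini-Hochberg adjustment, which is the default adjustment method in DESeq2. The code below is a plain-Python sketch of that procedure only; the function name `benjamini_hochberg` is an assumption for illustration, and DESeq2 additionally applies independent filtering, which is not reproduced here.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, with the made-up p-values `[0.01, 0.04, 0.03, 0.5]`, three values fall below 0.05 before adjustment but only one does afterwards, which is the shrinkage in the gene list described above.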
