Question

filter before differential expression analysis

4.7 years ago

jonathanpa12 ▴ 10

Hello, everyone

I am analyzing an RNA-seq dataset from different stages of cervical cancer. I would like to identify immune-related genes that are up- or down-regulated between the stages. I have a list of immune-related genes and was thinking of restricting the data to these genes before running the differential expression analysis with DESeq2, but I don't know whether this is a valid way to achieve my objective. Any help or suggestions? Thank you so much.

Jonathan Pena

RNA-Seq • 2.4k views

ADD COMMENT • link updated 4.7 years ago by i.sudbery 20k • written 4.7 years ago by jonathanpa12 ▴ 10

Why do you want to filter? Is the reason that those immune-related genes do not show up in the final list of differentially expressed genes? Or do you have a very strong biological rationale that the immune-related genes should be the ones driving the differences between the different conditions that you're analyzing?

ADD REPLY • link 4.7 years ago by Friederike 9.0k

The final list of differentially expressed genes shows only two immune-related genes. My thinking was that these genes usually have very low expression levels, yet even minimal differences in their expression can sometimes produce significant changes in the immune response. Because of this, I thought filtering for them before the differential expression analysis might be an option, but I don't have much experience with this type of analysis and I don't know whether it would be right to do so.

ADD REPLY • link 4.7 years ago by jonathanpa12 ▴ 10

When you say "final list of differentially expressed genes shows only two immune-related genes", do you mean they are not DE, or that they are NA?

ADD REPLY • link 4.7 years ago by i.sudbery 20k

Sorry, I need to learn to ask better questions. I used DESeq2 for the differential expression analysis and got 11 genes using these significance thresholds: an adjusted p-value of 0.05 and a log2 fold change of 1. Only one of the 11 genes is an immune-related gene from the list that I created manually.

ADD REPLY • link 4.7 years ago by jonathanpa12 ▴ 10

So, you are saying you have a total of 11 DE genes of which 1 overlaps with a hand-picked list of genes of interest?

ADD REPLY • link 4.7 years ago by Friederike 9.0k

Exactly. I would like to validate this analysis experimentally later, so I think it would be better if I could get more genes.

ADD REPLY • link 4.7 years ago by jonathanpa12 ▴ 10

That doesn't sound as if there's a lot to be done here. 11 DEGs is not a whole lot to begin with, which makes me think that these samples aren't really that different (or that there's a lot of variability between the replicates). You can always check the raw p-values (not the adjusted p-values) of the genes of interest, which will give you some insight into whether these immune-related genes show any promise in these samples. After all, pre-filtering mostly influences the severity with which the "raw" p-values are adjusted, as Ian has pointed out in his answer below.

ADD REPLY • link 4.7 years ago by Friederike 9.0k
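
To make the point about adjustment severity concrete, here is the standard Benjamini-Hochberg formula, assuming that is the adjustment in use (it is the DESeq2 default for padj). The symbols are introduced here for illustration: m is the number of genes tested and p_(1) <= ... <= p_(m) are the raw p-values sorted in ascending order.

```latex
p^{\mathrm{adj}}_{(i)} = \min\left(1,\; \min_{j \ge i} \frac{m}{j}\, p_{(j)}\right)
```

The factor m/j is what changes with pre-filtering: with fewer genes tested, m is smaller, so the same raw p-value tends to be adjusted less severely. A gene's rank j also changes within a smaller set, so the gain is not simply proportional to the reduction in m.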

Thank you so much for your help

ADD REPLY • link 4.7 years ago by jonathanpa12 ▴ 10

Ram · Answer 1 · 2020-03-30

If there is a certain subset of genes you are interested in, I would do the filtering after running the DESeq2 analysis. This is because information across all genes is used to calculate the normalisation factors and variances. If you are worried that filtering afterwards means you lose power due to multiple testing, you can run DESeq2 on all genes, subset the results to the genes you are interested in, and then re-calculate the padj column using p.adjust. I would also probably turn off filtering for low expression.
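
The "subset, then re-adjust" step can be sketched as follows. This is a minimal illustration in plain Python, not DESeq2 itself: `bh_adjust` is a hand-written Benjamini-Hochberg function intended to mirror what R's `p.adjust(..., method = "BH")` computes, and the gene names and raw p-values are made-up placeholders standing in for the `pvalue` column of a full DESeq2 results table. It also assumes the gene list was fixed before looking at the results; otherwise the re-adjusted values are not meaningful.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, keeping a running minimum so the
    # adjusted values stay monotone in the raw p-values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, idx in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from an analysis run on ALL genes (toy numbers).
raw_pvalues = {
    "GENE01": 0.0004, "GENE02": 0.003, "GENE03": 0.012, "GENE04": 0.04,
    "GENE05": 0.2, "GENE06": 0.35, "GENE07": 0.5, "GENE08": 0.61,
    "GENE09": 0.74, "GENE10": 0.9,
}

# Hypothetical list of genes of interest, chosen before seeing the results.
genes_of_interest = ["GENE02", "GENE03", "GENE07"]

# Adjustment over every tested gene (what the padj column would reflect).
all_genes = list(raw_pvalues)
full_padj = dict(zip(all_genes, bh_adjust([raw_pvalues[g] for g in all_genes])))

# Subset first, then re-adjust using only the genes of interest.
subset_padj = dict(
    zip(genes_of_interest, bh_adjust([raw_pvalues[g] for g in genes_of_interest]))
)

for gene in genes_of_interest:
    print(f"{gene}: raw={raw_pvalues[gene]:.4f} "
          f"padj_all={full_padj[gene]:.4f} padj_subset={subset_padj[gene]:.4f}")
```

In R, the equivalent re-adjustment would be along the lines of `p.adjust(res$pvalue[keep], method = "BH")`, where `res` and `keep` are hypothetical names for the results table and the subset index. The "filtering for low expression" mentioned above most likely refers to DESeq2's independent filtering, which can be disabled with `results(dds, independentFiltering = FALSE)`. Genes with NA p-values are not handled in this sketch and would need to be dropped before adjusting.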