Question

RNA-seq DESeq2: p-values and Venn plots in same analysis

6.5 years ago

BioHazzard • 0

I am doing differential expression analysis. I am comparing two different experiments, each experiment consisting of two treatments and their respective controls in duplicate.

I used DESeq2 to generate a distinct results object for each of the 4 control/treatment pairs and am doing downstream analysis on genes with the adjusted p-value below 0.01.

My question concerns the mismatch between calling genes differentially expressed by thresholding the adjusted p-value, which is a continuous quantity, and what a heatmap of the same genes shows. Again, the thresholds are applied to the DESeq2 results object generated for each of the conditions.

I will illustrate this with two images. These images take into consideration only two of the 4 conditions.

The venn diagram looks like this:

[Image: Venn diagram]

So in each condition a certain number of genes were differentially expressed and the overlaps between the two conditions are shown. In this example, in condition A there are 275 genes that are only differentially expressed in that condition.

However, when I create a heatmap of those genes, which should be exclusively differentially expressed in condition A, I observe that there is also an obvious difference in condition B, even if less strong. Note that the columns in the heatmap are ordered:

        CTR  CTR  CTR CTR TREAT TREAT TREAT TREAT
         A    A    B   B    A     A     B     B

[Image: heatmap]

The heatmap tells a different story than the venn diagram. While simply using the p threshold I can define genes as being uniquely differentially expressed in one condition only, the heatmap makes conditions A and B look much more similar, as also shown by the clustering.

Any tips or insight would be greatly appreciated.

RNA-Seq DESeq2 heatmap venn • 4.4k views

ADD COMMENT • link updated 6.5 years ago by devbt15 ▴ 30 • written 6.5 years ago by BioHazzard • 0

Hi, I don't know if it's the hospital firewall, but the images are not visible to me.

ADD REPLY • link 6.5 years ago by caggtaagtat ★ 1.9k

Can you recommend me an image hosting service that you know you can see?

ADD REPLY • link 6.5 years ago by BioHazzard • 0

No, I think I have to apply for a change in my firewall. Just wanted to make sure it's because of me.

How did you create your DESeqDataSet?

ADD REPLY • link 6.5 years ago by caggtaagtat ★ 1.9k

score 0 · Answer 1 · 2018-06-13

6.5 years ago

devbt15 ▴ 30

Since we cannot see the sample names it is hard to comment, but I presume the column order is the same as in the text above. A heatmap shows the absolute expression values from DESeq2, and here they look similar in A and B, for controls and treatments respectively. The heatmap does not take the p-value into account, whereas the Venn diagram is built from DEGs that you selected by p-value. This would mean that the treatment vs. control response differs only slightly between the A and B samples. I suggest plotting the log2 fold change (treatment vs. control) for A and B, so a heatmap with just 2 columns, to see the difference more clearly, and scaling the data before plotting (so that the heatmap scale goes from -1 to +1). Regards.
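A minimal sketch of that suggestion, in plain Python rather than R so it runs without any packages. The counts, gene names and the pseudocount are made-up assumptions for illustration; in a real analysis the `log2FoldChange` column of each DESeq2 results table would take the place of the `log2fc` helper here.

```python
import math
from statistics import mean

# Hypothetical normalized counts (not from the thread). Column order follows
# the question: CTR_A, CTR_A, CTR_B, CTR_B, TREAT_A, TREAT_A, TREAT_B, TREAT_B.
counts = {
    "gene1": [100, 110, 95, 105, 400, 420, 180, 200],
    "gene2": [50, 55, 60, 58, 52, 49, 61, 57],
}

def log2fc(treat, ctrl, pseudo=1.0):
    """log2 ratio of mean treatment to mean control, with a pseudocount."""
    return math.log2((mean(treat) + pseudo) / (mean(ctrl) + pseudo))

# Two columns per gene: (log2FC in A, log2FC in B).
lfc = {
    gene: (log2fc(row[4:6], row[0:2]), log2fc(row[6:8], row[2:4]))
    for gene, row in counts.items()
}

# Scale by the largest absolute value so every entry lies in [-1, 1].
max_abs = max(abs(v) for pair in lfc.values() for v in pair)
scaled = {gene: tuple(v / max_abs for v in pair) for gene, pair in lfc.items()}

for gene, (a, b) in scaled.items():
    print(f"{gene}\tA={a:+.2f}\tB={b:+.2f}")
```

Dividing by the global maximum is only one way to reach the -1 to +1 range; row-wise z-scoring is a common alternative but does not bound the values.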

ADD COMMENT • link 6.5 years ago by devbt15 ▴ 30

Thanks for the reply. My question is less about how to graph the data but more about how to interpret the data.

If I am trying to make statements such as "275 genes showed modified expression only in condition A", then the Venn diagram could lead me to make such a statement. However, I am reluctant to do so, because the heatmap tells me that those genes are clearly also regulated in condition B. Evidently they are regulated, but the magnitude of differential expression is smaller, so their p-values are above the threshold.

It looks to me as if venn diagrams are not very good for DE analysis.

Is there another way to analyse similarity between the conditions?
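One possible way to look at this, sketched in plain Python with made-up numbers: instead of only intersecting thresholded gene lists, compare the fold changes themselves. The dictionaries below are hypothetical stand-ins for the `log2FoldChange` and `padj` columns of the two DESeq2 results tables; the gene names and values are assumptions, not data from this analysis.

```python
import math

ALPHA = 0.01  # adjusted p-value cutoff used in the question

# Hypothetical per-gene (log2FoldChange, padj) for conditions A and B.
res_a = {"g1": (2.0, 0.001), "g2": (1.5, 0.004), "g3": (-1.8, 0.002), "g4": (0.1, 0.8)}
res_b = {"g1": (1.1, 0.03), "g2": (0.9, 0.06), "g3": (-1.0, 0.02), "g4": (0.0, 0.9)}

def significant(res, alpha=ALPHA):
    """Genes whose adjusted p-value falls below the cutoff."""
    return {g for g, (_, padj) in res.items() if padj < alpha}

# The "A only" region of the Venn diagram.
a_only = significant(res_a) - significant(res_b)

# Of those, how many move in the same direction in B anyway?
same_direction = {g for g in a_only if res_a[g][0] * res_b[g][0] > 0}

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)

genes = sorted(a_only)
r = pearson([res_a[g][0] for g in genes], [res_b[g][0] for g in genes])

print(f"A-only genes: {len(a_only)}, same direction in B: {len(same_direction)}")
print(f"log2FC correlation across A-only genes: {r:.2f}")
```

If most "A only" genes change in the same direction in B and the fold changes correlate strongly, the Venn region mostly reflects a gene passing the cutoff in one condition and narrowly missing it in the other, not evidence of a response specific to A.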

ADD REPLY • link 6.5 years ago by BioHazzard • 0

Can you maybe group your data into CTR_A, CTR_B, TREAT_A, TREAT_B and then do DGE between conditions CTR_A and CTR_B, using one of them as the reference? Or are the experiments too different?
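A rough illustration of what a single four-group analysis makes possible, assuming the two experiments are comparable enough to analyse together: the treatment effect in A can be compared directly with the treatment effect in B as a difference of differences. The counts are invented and this is plain Python arithmetic on group means, not a statistical test; in DESeq2 the same question would normally be asked with an interaction term in the design formula.

```python
import math
from statistics import mean

# Column order from the question; group names follow the reply above.
groups = ["CTR_A", "CTR_A", "CTR_B", "CTR_B",
          "TREAT_A", "TREAT_A", "TREAT_B", "TREAT_B"]

# Hypothetical normalized counts for a single gene.
gene_counts = [100, 110, 95, 105, 400, 420, 180, 200]

def group_log2_means(values, labels, pseudo=1.0):
    """Mean per group on the log2 scale, with a pseudocount."""
    by_group = {}
    for value, label in zip(values, labels):
        by_group.setdefault(label, []).append(value)
    return {label: math.log2(mean(vs) + pseudo) for label, vs in by_group.items()}

m = group_log2_means(gene_counts, groups)

effect_a = m["TREAT_A"] - m["CTR_A"]   # treatment effect within experiment A
effect_b = m["TREAT_B"] - m["CTR_B"]   # treatment effect within experiment B
interaction = effect_a - effect_b      # how much the two effects differ

print(f"effect A: {effect_a:.2f}, effect B: {effect_b:.2f}, "
      f"difference: {interaction:.2f}")
```

A gene is a candidate for "regulated differently in A than in B" when this difference is large relative to its variability, which is a different question from whether it passes a p-value cutoff in A but not in B.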

ADD REPLY • link 6.5 years ago by caggtaagtat ★ 1.9k