Asymmetric Differential Gene Expression with Sleuth
2.8 years ago

Hey, I am unsure how trustworthy my differential gene expression (DGE) analysis is and was hoping someone might have an idea.

I want to analyze bulk sequencing data from two tissues (P and G), each with 2 replicates, using sleuth. However, when I visualize the data in a volcano plot, it is highly asymmetrical. When I look at the scaled_reads_per_base of the genes (2nd plot), I see far fewer genes with zero values in the G tissue and instead many more genes with low read coverage (below 30, vertical line). I thought this might interfere with the normalization. But even applying a very stringent filter and discarding all genes with a coverage below 30 in both tissues does not help. Zooming in on the genes above the threshold shows a distinct shift towards higher values in the G tissue, both for scaled_reads_per_base (3rd plot) and for TPM values (not shown).
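For anyone wanting to reproduce the filter described above outside of sleuth, here is a minimal Python sketch. It assumes the rule is "drop a gene only if its mean coverage across replicates is below 30 in both tissues". The function name, the use of the replicate mean, and the toy values are assumptions made for illustration, not the poster's actual code or data.

```python
from statistics import mean

def filter_low_coverage(coverage, threshold=30.0):
    """Keep a gene unless its mean coverage is below `threshold` in BOTH tissues.

    `coverage` maps gene -> {tissue: [replicate values]} (hypothetical layout).
    """
    kept = {}
    for gene, by_tissue in coverage.items():
        tissue_means = [mean(values) for values in by_tissue.values()]
        # A gene survives if at least one tissue reaches the threshold.
        if any(m >= threshold for m in tissue_means):
            kept[gene] = by_tissue
    return kept

# Toy values, invented purely for illustration.
coverage = {
    "geneA": {"P": [0.0, 2.0], "G": [12.0, 18.0]},        # low in both tissues
    "geneB": {"P": [5.0, 9.0], "G": [40.0, 60.0]},        # above threshold in G only
    "geneC": {"P": [120.0, 100.0], "G": [310.0, 290.0]},  # above threshold in both
}

filtered = filter_low_coverage(coverage)
print(sorted(filtered))
```

Note that with this rule a gene like geneB, which is near zero in P but well covered in G, stays in the data set, so the filter by itself cannot remove a one-sided shift.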

Now I am wondering: is this real, with genes generally more highly expressed in G, or is it an artifact that the normalization does not take care of? And is there a way to distinguish between the two scenarios?

I also tried other normalization techniques (TMM/quantile/RPKM) and methods (DESeq2), but I always get a shift in the data.

Volcano plot of DEGs from sleuth; density plot of scaled_reads_per_base before (2nd plot) and after (3rd plot) filtering

Edit: Added the MA Plot.

Can you show an MA-plot (baseMean vs logFC)? baseMean is basically the rowMeans of the normalized counts on the log2 scale. If the shift is technical, you will need to define control genes to center the normalization on, but let's see the plot first.
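In case it helps, the two axes of such a plot can be computed roughly as in the sketch below. It is plain Python and not the sleuth or DESeq2 implementation. The pseudocount, the choice of G minus P as the direction of the fold change, and the example counts are all assumptions for illustration.

```python
import math
from statistics import mean, median

def ma_values(counts_p, counts_g, pseudocount=0.5):
    """Return {gene: (A, M)} from normalized counts of two conditions.

    A is the mean of log2 counts over all samples (the "baseMean" axis).
    M is mean log2(G) minus mean log2(P) (the logFC axis).
    """
    out = {}
    for gene in counts_p:
        log_p = [math.log2(c + pseudocount) for c in counts_p[gene]]
        log_g = [math.log2(c + pseudocount) for c in counts_g[gene]]
        a = mean(log_p + log_g)
        m = mean(log_g) - mean(log_p)
        out[gene] = (a, m)
    return out

# Hypothetical normalized counts, two replicates per tissue.
counts_p = {"geneA": [4.0, 4.0], "geneB": [100.0, 120.0]}
counts_g = {"geneA": [16.0, 16.0], "geneB": [110.0, 105.0]}

ma = ma_values(counts_p, counts_g, pseudocount=0.0)
print("geneA (A, M):", ma["geneA"])
# A median M far from zero across all genes is one quick summary of a global shift.
print("median M:", median(m for _, m in ma.values()))
```

Plotting M against A then shows whether the shift depends on expression level (which would hint at a technical cause) or is roughly constant across the range.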

Thanks for the reply, I added the MA-plot to the post. Besides the asymmetry described above, I don't see a particular trend.

I agree. Based on this, there is not much reason to believe that this shift is technical, so it could indeed be a biological feature.

I nevertheless tried a normalization based on housekeeping genes (about 2500) and included the normalization factors in the linear sleuth model using the RUVSeq package, like here. This removes most of the asymmetry. Do you think this is a reasonable thing to do, or would you advise against it?

I wouldn't recommend doing this; with that many genes downregulated from your volcano plot, a large fraction of 2500 genes will be downregulated. If you normalize by those 2500 genes, of course the asymmetry will be removed (you're normalizing out a good part of the downregulation).

Unless you actually know that those 2500 genes aren't downregulated, don't normalize by them.
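To make that point concrete, here is a toy Python sketch. It is deliberately much simpler than what RUVSeq does: it just centers all log fold changes on the median of a chosen control set. The gene names, fold changes, and control sets are invented for illustration.

```python
from statistics import median

def center_on_controls(log_fc, control_genes):
    """Subtract the median log fold change of the control genes from every gene."""
    offset = median(log_fc[g] for g in control_genes)
    return {gene: value - offset for gene, value in log_fc.items()}

# Hypothetical "true" log2 fold changes: six of eight genes are really down by 1.
true_log_fc = {
    "g1": -1.0, "g2": -1.0, "g3": -1.0, "g4": -1.0,
    "g5": -1.0, "g6": -1.0, "g7": 0.0, "g8": 0.0,
}

# A control set that is itself mostly downregulated (the risky case).
affected_controls = ["g1", "g2", "g7"]
# A control set that is genuinely unchanged (only usable if you actually know this).
stable_controls = ["g7", "g8"]

after_affected = center_on_controls(true_log_fc, affected_controls)
after_stable = center_on_controls(true_log_fc, stable_controls)

print("affected controls:", after_affected)
print("stable controls:  ", after_stable)
```

With the affected control set, the six downregulated genes end up at 0 and the two unchanged genes appear upregulated by 1, so the picture looks more symmetrical but is wrong. With the stable control set, the fold changes are left as they were.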

Your beta values are the log fold changes and there are clearly more downregulated genes. Don't artificially shift things to make your data more symmetrical.

That's my opinion. However, it's worth taking a closer look at the genes -- do some GO/enrichment analysis and validate your findings, open up Excel and rank your genes in each sample from highest to lowest in expression and see how the rankings compare between samples, etc.
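If you would rather script the rank comparison than do it in Excel, a Spearman rank correlation between two samples is one way to summarize it. Below is a minimal pure-Python sketch. The function names and expression values are made up for illustration, and ranks are assigned from lowest to highest, which gives the same correlation as ranking from highest to lowest.

```python
import math

def average_ranks(values):
    """Rank values from lowest (1) to highest, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expression values for the same five genes in one P and one G sample.
sample_p = [5.0, 50.0, 10.0, 200.0, 80.0]
sample_g = [12.0, 95.0, 30.0, 410.0, 150.0]  # higher overall, same ordering

print("Spearman rho:", spearman(sample_p, sample_g))
```

A rho close to 1 between tissues would mean the gene ordering is largely preserved despite the shift, while a clearly lower value than between replicates would mean the ordering itself changes. Neither outcome proves the shift is biological or technical by itself, but it is a normalization-independent view of the data.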
