What do these MA plots signify?
3.7 years ago
fufor94 ▴ 10

Hello, I am fairly new to DGE and exploratory analysis of RNA-seq data. I am looking at differentially expressed genes in 116 strains of E. coli. I used Kallisto to align and quantify my genes (using E. coli K12 as my reference genome). I have successfully run the DESeq2 pipeline and generated some MA plots, but I have some trouble interpreting them. Some look quite odd in my opinion, and I would appreciate any and all insights. Thanks.

strain 1 vs ref strain

strain 2 vs ref strain

strain 3 vs ref strain

seq rna maplot sequencing deseq2 • 8.3k views

The Y-axis shows the expression change (log fold change) between the conditions, and the X-axis shows how strongly each gene is expressed on average. Lowly expressed genes normally show higher variability, which is part of what the methods implemented in tools like DESeq2, edgeR or limma-voom correct for. The blue dots probably mark genes that are significantly different between the conditions (below whatever significance threshold you set in your code).
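To make the two axes concrete, here is a minimal sketch of how the classic MA coordinates are computed for a pair of samples. The function name, the pseudocount and the example counts are all made up for illustration. DESeq2's own MA plot uses the mean of normalized counts on the X-axis and its model-based log2 fold change on the Y-axis, so this is only the textbook definition, not what DESeq2 runs internally.

```python
import math

def ma_coordinates(counts_a, counts_b, pseudocount=1.0):
    """Return (A, M) pairs for two equal-length lists of normalized counts.

    A = average log2 expression (x-axis)
    M = log2 fold change of a over b (y-axis)
    The pseudocount (an assumption of this sketch) avoids log2(0).
    """
    points = []
    for a, b in zip(counts_a, counts_b):
        log_a = math.log2(a + pseudocount)
        log_b = math.log2(b + pseudocount)
        points.append(((log_a + log_b) / 2, log_a - log_b))
    return points

# Hypothetical normalized counts for three genes
strain = [7, 127, 1023]
reference = [1, 127, 255]
for a_val, m_val in ma_coordinates(strain, reference):
    print(f"A={a_val:.2f}  M={m_val:.2f}")
```

The first and third genes have the same fold change (M = 2) but sit far apart on the X-axis, which is exactly the information an MA plot adds.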


Thank you so much for the explanation. So it is kind of similar to what I will visualize in volcano plots.


A volcano plot shows logFC vs -log10(p); it does not include information about average expression. The MA plot does not show the magnitude of the p-value. I usually plot both, as they simply provide different information.


You could plot logFC vs -log10(p) as ATpoint suggested, and pass the size of each gene's dot to the plotting function as log10(expression) (or log2(expression)), adjusted as needed. You will have to see how it looks ...
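As a rough sketch of that idea, the snippet below computes volcano coordinates plus a dot size derived from average expression. The function name, the `size_scale` factor, the +1 offset and the example genes are assumptions for illustration only; the resulting sizes would be handed to whatever size argument your plotting function offers.

```python
import math

def volcano_points(genes, size_scale=10.0):
    """genes: iterable of (name, log2fc, pvalue, mean_expression).

    Returns (name, x, y, size) where x = log2 fold change,
    y = -log10(p) and size grows with log10 of mean expression.
    size_scale and the +1 offset are arbitrary choices for this sketch.
    """
    points = []
    for name, log2fc, pvalue, mean_expr in genes:
        x = log2fc
        y = -math.log10(pvalue)
        size = size_scale * math.log10(mean_expr + 1)
        points.append((name, x, y, size))
    return points

# Two hypothetical genes: same fold change and p-value, different expression
genes = [("geneA", 2.0, 0.001, 9), ("geneB", 2.0, 0.001, 999)]
for point in volcano_points(genes):
    print(point)
```

Both genes land on the same spot in a plain volcano plot; only the dot size tells them apart.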

3.7 years ago
ATpoint 85k

The first two are perfectly normal if you ask me; you simply have few DEGs. The third one indicates an unbalanced DEG profile, with many genes going down in one condition. Nothing "odd" as far as I am concerned.

You can use lfcShrink to shrink the logFCs, which might correct some of the larger FCs at the bottom left of the plot, see https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#log-fold-change-shrinkage-for-visualization-and-ranking

Shrinkage is basically a moderation of the fold changes. Simplified: if counts are decently high and/or the standard errors of the lfcs between replicates are small, then the lfcs are probably trustworthy and will stand as-is. If not (large standard errors), they get shrunken towards zero, as there is little or no evidence that the large, biased lfcs are real rather than artifacts of low counts or large variation between replicates. This is why shrunken lfcs are good for both visualization and ranking by effect size (= lfc).
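To give a feel for the idea, here is a toy illustration in Python. It is not what lfcShrink computes (its estimators, such as apeglm, ashr or the normal prior, are more involved). It is just the posterior mean under a zero-centred normal prior, with a made-up `prior_sd`, showing how the standard error decides how much an estimate is pulled towards zero.

```python
def shrink_lfc(lfc, se, prior_sd=1.0):
    """Toy shrinkage: posterior mean of a log fold change under a
    zero-centred normal prior with standard deviation prior_sd.

    Small standard error -> weight near 1, estimate stays almost as-is.
    Large standard error -> weight near 0, estimate is pulled to zero.
    prior_sd is an assumption of this sketch, not a DESeq2 parameter.
    """
    weight = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return weight * lfc

# Same raw log fold change, very different reliability (hypothetical values)
well_measured = shrink_lfc(-4.0, se=0.2)  # high counts, tight replicates
noisy = shrink_lfc(-4.0, se=2.0)          # low counts, variable replicates
print(f"well measured: {well_measured:.2f}, noisy: {noisy:.2f}")
```

The well-measured gene keeps most of its fold change while the noisy one collapses towards zero, which is the behaviour described above for the points at the low-count end of the MA plot.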


Thanks, I will try shrinking the logFCs like you suggested.
