Hello, I am fairly new to DGE and exploratory analysis of RNA-seq data. I am looking at differentially expressed genes in 116 strains of E. coli. I used Kallisto to align and quantify my genes (using E. coli K12 as my reference genome). I have successfully run the DESeq2 pipeline and generated some MA plots, but I have some trouble interpreting them. Some look quite odd in my opinion, and I would appreciate any and all insights. Thanks.
You get the expression change between the conditions (the log fold change) on the Y-axis and how strongly each gene is expressed (its average expression) on the X-axis. Normally, lowly expressed genes have higher variability; that is in part what the methods implemented in tools like DESeq2, edgeR or limma-voom correct for. The blue dots probably mark genes that are significantly different between the conditions (below a certain threshold set in your code).
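To make the two axes concrete, here is a minimal sketch of how the coordinates of one dot could be computed by hand. It uses the classic M/A definition on log2 values; the gene names, the counts and the pseudocount of 1 are made up for illustration, and DESeq2's own plotting function draws the mean of normalized counts on the X-axis rather than this exact A value.

```python
import math

def ma_coordinates(mean_a, mean_b, pseudocount=1.0):
    """Return (A, M) for one gene from its mean normalized counts
    in two conditions. The pseudocount is an assumption that avoids log2(0)."""
    log_a = math.log2(mean_a + pseudocount)
    log_b = math.log2(mean_b + pseudocount)
    a_value = (log_a + log_b) / 2  # average log2 expression (X-axis)
    m_value = log_b - log_a        # log2 fold change, B vs A (Y-axis)
    return a_value, m_value

# Hypothetical genes: (mean count in condition A, mean count in condition B)
genes = {"geneLow": (3.0, 7.0), "geneHigh": (1023.0, 2047.0)}
ma = {name: ma_coordinates(a, b) for name, (a, b) in genes.items()}

for name, (a_value, m_value) in ma.items():
    print(name, "A =", a_value, "M =", m_value)
```

Both hypothetical genes have the same fold change, so they sit at the same height in the plot; only their position along the X-axis differs.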
Thank you so much for the explanation. So it is somewhat similar to what I would visualize in a volcano plot.
A volcano plot shows logFC vs -log10(p); it does not include information about average expression. The MA plot, in turn, does not provide information about the magnitude of the p-value. I usually plot both, as they simply provide different information.
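A small sketch of that difference, assuming a results table with the columns baseMean, log2FoldChange and pvalue (the names DESeq2 uses in its results). The two rows are invented values, and putting baseMean on a log10 scale for the MA point is just one reasonable choice.

```python
import math

# Hypothetical result rows; keys mirror DESeq2 result column names
results = [
    {"gene": "g1", "baseMean": 10.0, "log2FoldChange": 2.0, "pvalue": 0.001},
    {"gene": "g2", "baseMean": 1000.0, "log2FoldChange": -0.5, "pvalue": 0.5},
]

def volcano_point(row):
    """(x, y) for a volcano plot: fold change vs -log10(p). No expression level."""
    return row["log2FoldChange"], -math.log10(row["pvalue"])

def ma_point(row):
    """(x, y) for an MA plot: average expression vs fold change. No p-value."""
    return math.log10(row["baseMean"]), row["log2FoldChange"]

volcano = {r["gene"]: volcano_point(r) for r in results}
ma = {r["gene"]: ma_point(r) for r in results}

print(volcano)
print(ma)
```

The fold change appears in both plots, but each one pairs it with a different second quantity, which is why looking at both is useful.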
You could plot logFC vs -log10(p) as ATpoint suggested, and pass the dot size of each gene to the plotting function as log10(expression) (or log2(expression)), rescaled as needed. You will have to see how it looks ...
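One possible way to derive such dot sizes, as a sketch only: the min_size and scale constants, the +1 offset and the example rows are arbitrary assumptions that would need tuning against real data.

```python
import math

def dot_size(base_mean, min_size=5.0, scale=20.0):
    """Marker size that grows with log10(expression).
    min_size and scale are arbitrary choices; +1 avoids log10(0)."""
    return min_size + scale * math.log10(base_mean + 1.0)

# Hypothetical rows: (gene, log2 fold change, p-value, mean expression)
rows = [("g1", 2.0, 0.001, 9.0), ("g2", -0.5, 0.5, 999.0)]

# Each point: (x = logFC, y = -log10(p), size from expression)
points = [(lfc, -math.log10(p), dot_size(bm)) for _, lfc, p, bm in rows]

for (gene, *_), point in zip(rows, points):
    print(gene, point)
```

The third value of each point could then be handed to the size argument of a scatter function, for example the s parameter of matplotlib's scatter, so that highly expressed genes are drawn as larger dots.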