Hello everyone,
I am just seeking confirmation about the likely relationship between the baseMean
and the mean of normalised counts
measured in DESeq2. The thing is, at the beginning of my analysis I filter out genes with low counts:
dds[rowSums(counts(dds)>10) >=n
Despite this, I still have genes with baseMean < 1 in my gene list, and thus in my MA plot. I know that the mean of normalised counts
is the average expression between the compared groups, whereas the baseMean is just the average of the normalised count values, i.e. the counts divided by size factors. So there should be a relationship between those values, I guess?
Could it be that they successfully passed the shrinkage due to a very high log2 fold change?
So, what exactly is your question here? Your code isn't functional, and it's really not clear what you're asking.
n = the number of samples; I forgot to mention that.
Is baseMean = mean of normalised counts? If yes, why do I still have low-count genes in my MA plot after filtering them out? These are genes with a high log2 fold change.
Yes, baseMean is the mean of the normalized counts across all samples, not taking into account gene length. As for why low count genes are not removed, you aren't doing any filtering, as your code still isn't functional.
The DESeq2 vignette has a clear example of this:
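This is not the vignette's code verbatim, just a minimal base-R sketch of the same two ideas on a toy matrix: baseMean as the row mean of size-factor-normalized counts, and pre-filtering that only takes effect once the subset is assigned back. The matrix `cts`, its values, and the variable names are made up for illustration; the size factors are computed by hand with the median-of-ratios approach rather than by calling DESeq2.

```r
# Toy raw count matrix: 4 genes x 4 samples (hypothetical values).
cts <- matrix(
  c(  0,   1,   0,   2,    # geneA: low counts in every sample
     15,  22,  18,  30,    # geneB
    120,  95, 140, 110,    # geneC
     11,  40,  12,  55),   # geneD
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), paste0("s", 1:4))
)

# Median-of-ratios size factors, done by hand.
# Genes with a zero in any sample have a log geometric mean of -Inf
# and are left out of the size factor estimate.
log_geo_means <- rowMeans(log(cts))
usable <- is.finite(log_geo_means)
size_factors <- apply(cts, 2, function(col) {
  exp(median(log(col[usable]) - log_geo_means[usable]))
})

# Normalized counts: each sample (column) divided by its size factor.
norm_cts <- sweep(cts, 2, size_factors, "/")

# baseMean: mean of the normalized counts across ALL samples, per gene.
base_mean <- rowMeans(norm_cts)

# Pre-filter: keep genes with more than 10 raw counts in at least n samples.
n <- ncol(cts)
keep <- rowSums(cts > 10) >= n

# The subset has to be assigned to an object, otherwise nothing is removed.
cts_filtered <- cts[keep, , drop = FALSE]

print(round(base_mean, 2))
print(rownames(cts_filtered))
```

With a real `DESeqDataSet`, `counts(dds)` would take the place of `cts`, `counts(dds, normalized = TRUE)` the place of `norm_cts`, and the filter would be applied as `dds <- dds[keep, ]` before running `DESeq()`. In this toy example geneA has a baseMean below 1 and is the only gene dropped by the filter, so a gene like that showing up in an MA plot suggests the filtered object was never the one passed on.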