I am running a DE pipeline on a bulk blood dataset. Each patient has two RNA-seq samples, one pre-exposure and one post-exposure. I am using limma-voom for the analysis, and my mean-variance plot looks strange. I ran standard pre-processing (STAR + featureCounts), so I'm not sure why this would happen unless bulk blood samples have properties that cause these peculiarities. Are there any particular situations in which a plot like this would arise? Bulk blood data are generally pretty messy, so after re-running the analysis several times looking for errors in my code (and not finding any), I suspect that might be the issue. Below are the mean-variance plot and my code.
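library(limma)
# Design matrix: no intercept, one column per exposure level plus the within-subject covariates
design <- model.matrix(~0 + exposure + within_subject_cov_1 + within_subject_cov_2 + within_subject_cov_3, data = info)
# First voom pass, ignoring the pairing, to get weights for the correlation estimate
counts <- voom(dge, design)
corfit <- duplicateCorrelation(counts, design, block = info$subject)
corfit$consensus.correlation
# Second voom pass, now accounting for the within-subject correlation
counts_voom <- voom(dge, design, block = info$subject, correlation = corfit$consensus.correlation)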
Thanks.
You are right, that looks strange! It seems like you mainly have very (!) large counts? Also, did you use edgeR::calcNormFactors?
Yes, I used calcNormFactors with the TMM method prior to voom normalization on the dge object.
Nice! Have you filtered your expression matrix a lot? It looks like it is missing all the low counts...
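One rough way to check this is to compare the average log-CPM per gene before and after filtering. The sketch below is only an illustration in base R: the simulated matrix `sim_counts` stands in for your `dge$counts`, and the filter threshold (at least 100 counts in every sample) is an assumed example of an aggressive filter, not something taken from your pipeline. The log-CPM formula uses the same 0.5 offset that voom applies.

```r
# Simulated counts standing in for dge$counts (hypothetical data: 1000 genes x 6 samples)
set.seed(1)
n_genes <- 1000
n_samples <- 6
mu <- 2^runif(n_genes, 0, 14)  # wide range of expression levels
sim_counts <- matrix(rnbinom(n_genes * n_samples, mu = mu, size = 10),
                     nrow = n_genes)

# log2 counts per million with a 0.5 offset, as voom computes it
lib_size <- colSums(sim_counts)
log_cpm <- log2(t((t(sim_counts) + 0.5) / (lib_size + 1) * 1e6))

# Average log-CPM per gene; this is roughly the x-axis of the mean-variance plot
ave_log_cpm <- rowMeans(log_cpm)

# Assumed aggressive filter: keep genes with >= 100 counts in every sample
keep <- apply(sim_counts, 1, min) >= 100

range_all <- range(ave_log_cpm)         # expression range before filtering
range_kept <- range(ave_log_cpm[keep])  # expression range after filtering
frac_kept <- mean(keep)                 # fraction of genes that survive

print(range_all)
print(range_kept)
print(frac_kept)
```

If the lower end of the range moves up a lot after filtering and only a small fraction of genes survives, the left-hand part of the mean-variance trend will be cut off, which could explain a plot that seems to contain only large counts.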