Hi,
I'm running an RNA-seq experiment comparing two different conditions on the same subject... my count matrix looks something like this (but much bigger):
341_MO_r150 341_SP_r150 345_MO_r129 345_SP_r129 350_MO_r178 350_SP_r178
A1BG 0 0 2 0 50 0
A1BG-AS1 0 0 0 11 3 0
A1CF 0 0 0 0 37 0
A2M 1 0 0 0 0 0
A2M-AS1 1 0 0 0 9 0
A2ML1 0 0 1 0 6 0
After creating a _DESeq2_ object this way...
> dds <- DESeqDataSetFromMatrix(countData = as.matrix(cnt), colData = mde, design = ~ condition + subject)
> dds <- dds[rowSums(counts(dds)) >= 10, ]
> dds <- DESeq(dds, parallel = TRUE)
> res <- results(dds)
> plotMA(res, ylim = c(-2, 2), main = "res")
... I'm getting this (very) weird MA-plot:
The DESeq author suggested checking whether there are confounders related to library size, but I cannot see any clear relation (CT values seem to have some influence, but I don't know whether _such_ influence is enough to distort the MA-plot in that way...).
Any idea what I'm missing, please? Thanks a lot!
Thanks for your help, @ATpoint. Indeed, the problem was the very large (excessive) variation in the number of raw counts per sample. I tested only those samples between Q2 and Q3 (of `colSums(raw_counts)`) and I got a normal-looking MA-plot... so the _easiest_ solution will be resequencing to balance the raw counts across samples. So, thanks a lot!

Good to hear you could figure it out. I moved this comment to an answer so people having similar issues immediately see a potential solution.
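For readers who want to repeat that check, here is a minimal base-R sketch of keeping only the samples whose library size lies between Q2 and Q3. It assumes "Q2" and "Q3" mean the median and the third quartile of `colSums(raw_counts)`, which the thread does not spell out, and it reuses the six-gene excerpt from the question as toy data; `keep` and `filtered` are names made up for illustration.

```r
# Toy data: the six-gene excerpt from the question (genes x samples).
raw_counts <- matrix(
  c(0, 0, 2,  0, 50, 0,
    0, 0, 0, 11,  3, 0,
    0, 0, 0,  0, 37, 0,
    1, 0, 0,  0,  0, 0,
    1, 0, 0,  0,  9, 0,
    0, 0, 1,  0,  6, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("A1BG", "A1BG-AS1", "A1CF", "A2M", "A2M-AS1", "A2ML1"),
    c("341_MO_r150", "341_SP_r150", "345_MO_r129",
      "345_SP_r129", "350_MO_r178", "350_SP_r178")
  )
)

# Library size = total raw counts per sample.
lib_size <- colSums(raw_counts)

# Assumption: Q2 = median (50%), Q3 = third quartile (75%).
bounds <- quantile(lib_size, probs = c(0.50, 0.75))

# Keep samples whose library size falls inside [Q2, Q3].
keep <- lib_size >= bounds[1] & lib_size <= bounds[2]
filtered <- raw_counts[, keep, drop = FALSE]

print(lib_size)
print(colnames(filtered))
```

With a paired design, it is probably worth also dropping any subject that loses one of its two samples in this step, since a subject left with a single sample no longer provides a within-subject comparison.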
Can you give some background? What are these samples (organism, treatments), what are the groups? What is condition and subject? Is a global change in the transcriptional profile expected? How were samples preprocessed? You could also start by manually plotting the MA-plot using the raw data for the groups, outside of the DESeq2 results object. Use just the raw counts from the original matrix and plot average expression against the fold change between the groups. This could help to decide whether this is a general problem with your data or whether the shrinkage might be corrupting things here.
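A manual MA-plot of that kind could be sketched in base R roughly as follows. This is only an illustration on the six-gene excerpt from the question: the `+ 1` pseudocount (to avoid taking the log of zero) and the choice of SP over MO as the direction of the fold change are assumptions, not something stated in the thread.

```r
# Toy data: the six-gene excerpt from the question (genes x samples).
cnt <- matrix(
  c(0, 0, 2,  0, 50, 0,
    0, 0, 0, 11,  3, 0,
    0, 0, 0,  0, 37, 0,
    1, 0, 0,  0,  0, 0,
    1, 0, 0,  0,  9, 0,
    0, 0, 1,  0,  6, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("A1BG", "A1BG-AS1", "A1CF", "A2M", "A2M-AS1", "A2ML1"),
    c("341_MO_r150", "341_SP_r150", "345_MO_r129",
      "345_SP_r129", "350_MO_r178", "350_SP_r178")
  )
)

# Condition label taken from the middle part of each column name.
condition <- sapply(strsplit(colnames(cnt), "_"), `[`, 2)

# log2 of the raw counts; the pseudocount of 1 is an assumption.
log_cnt <- log2(cnt + 1)

# Per-gene mean log2 count in each group.
mean_MO <- rowMeans(log_cnt[, condition == "MO", drop = FALSE])
mean_SP <- rowMeans(log_cnt[, condition == "SP", drop = FALSE])

A <- (mean_MO + mean_SP) / 2  # average expression (x axis)
M <- mean_SP - mean_MO        # log2 fold change, SP vs MO (y axis)

plot(A, M, pch = 20,
     xlab = "mean log2(count + 1)",
     ylab = "log2 fold change (SP vs MO)",
     main = "Manual MA-plot from raw counts")
abline(h = 0, col = "red")
```

Because no library-size normalization is applied here, a cloud that sits clearly off the zero line would itself hint at unbalanced sequencing depth between the groups.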
Hi, @ATpoint. I'm comparing, in the same patient (e.g., 341; within a big group of patients), two different conditions (SP and MO)... both processed at the same time and following a 3'-RNAseq protocol. I wouldn't expect a very big difference, just a small one... is this affecting the MA-plot? Thanks!
Sorry, I do not understand. So you have a lot of patients, and from each you have two samples that represent `condition`, right? So `subject` is probably the patient itself? What is the CT value here? Probably not a qPCR CT, right? Can you make a barplot using the output of `colSums` on the raw counts? It looks like many samples have quite low read counts in this plot on the lower left, i.e. the blue to deep-blue ones.
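The barplot asked for here might look like the sketch below, again using the six-gene excerpt from the question as stand-in data. Colouring the bars by condition is an extra added for illustration, not part of the request.

```r
# Toy data: the six-gene excerpt from the question (genes x samples).
raw_counts <- matrix(
  c(0, 0, 2,  0, 50, 0,
    0, 0, 0, 11,  3, 0,
    0, 0, 0,  0, 37, 0,
    1, 0, 0,  0,  0, 0,
    1, 0, 0,  0,  9, 0,
    0, 0, 1,  0,  6, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("A1BG", "A1BG-AS1", "A1CF", "A2M", "A2M-AS1", "A2ML1"),
    c("341_MO_r150", "341_SP_r150", "345_MO_r129",
      "345_SP_r129", "350_MO_r178", "350_SP_r178")
  )
)

# Total raw counts per sample (library size).
lib_size <- colSums(raw_counts)

# Illustrative extra: colour bars by condition (MO vs SP).
condition <- sapply(strsplit(names(lib_size), "_"), `[`, 2)
bar_col <- ifelse(condition == "MO", "steelblue", "orange")

barplot(lib_size, col = bar_col, las = 2, cex.names = 0.7,
        ylab = "total raw counts", main = "Library size per sample")
legend("topleft", legend = c("MO", "SP"), fill = c("steelblue", "orange"))

# Ratio of largest to smallest non-zero library as a rough imbalance check.
imbalance <- max(lib_size) / min(lib_size[lib_size > 0])
print(imbalance)
```

On the full matrix, a large ratio between the biggest and smallest libraries would be consistent with the low-count samples suspected above.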