Question

Weird MA-plot (DESeq2)

0

4.6 years ago

garcesj ▴ 50

Hi,

I'm running an RNAseq experiment comparing two different conditions within the same subject... my count matrix looks something like this (but much bigger):

         341_MO_r150 341_SP_r150 345_MO_r129 345_SP_r129 350_MO_r178 350_SP_r178
A1BG               0           0           2           0          50           0
A1BG-AS1           0           0           0          11           3           0
A1CF               0           0           0           0          37           0
A2M                1           0           0           0           0           0
A2M-AS1            1           0           0           0           9           0
A2ML1              0           0           1           0           6           0

After creating a _DESeq2_ object like this...

> dds <- DESeqDataSetFromMatrix(countData = as.matrix(cnt), colData = mde, design = ~ condition+ subject)
> dds <- dds[rowSums(counts(dds)) >= 10,]
> dds <- DESeq(dds, parallel = TRUE)
> res <- results(dds)
> plotMA(res, ylim = c(-2,2), main = "res")

... I'm getting this (very) weird MA-plot: [image: MA-plot]

The DESeq author suggested checking whether there are confounders related to library size, but I can't see any clear relationship (CT values seem to have some influence, but I don't know whether _such_ influence is enough to distort the MA-plot in that way...).

[image]

Any idea what I'm missing, please? Thanks a lot!

RNA-Seq DESeq2 confounders • 3.1k views

ADD COMMENT • link updated 4.6 years ago by GouthamAtla 12k • written 4.6 years ago by garcesj ▴ 50

1

Thanks for your help, @ATpoint. Indeed, the problem was the (excessive) variation in the number of raw counts per sample. I tested only those samples between Q2 and Q3 (of colSums(raw_counts)) and I got a normal-looking MA-plot... so the _easiest_ solution will be resequencing and balancing all raw counts. So, thanks a lot!
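
A minimal sketch of that check, for anyone wanting to repeat it. The count matrix here is simulated and the sample names are hypothetical; reading "Q2 and Q3" as the median and the third quartile of the library sizes is an assumption about what was meant.

```r
# Simulated stand-in for the raw count matrix (genes x samples);
# 12 hypothetical samples with deliberately unbalanced sequencing depth.
set.seed(4)
depth <- c(1, 2, 5, 10, 20, 40, 60, 80, 100, 150, 300, 600)
raw_counts <- sapply(depth, function(d) rnbinom(300, mu = d, size = 2))
colnames(raw_counts) <- paste0("s", seq_along(depth))

# Library size = total raw counts per sample.
lib_sizes <- colSums(raw_counts)

# Q2 (median) and Q3 (third quartile) of the library sizes.
q <- quantile(lib_sizes, probs = c(0.5, 0.75))

# Keep only the samples whose library size falls between Q2 and Q3.
in_range <- lib_sizes >= q[1] & lib_sizes <= q[2]
cnt_sub <- raw_counts[, in_range, drop = FALSE]

colnames(cnt_sub)
```

With a paired design, a subject left with only one sample after this filter no longer contributes a within-subject comparison, so it may be worth filtering by subject rather than by sample.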

ADD REPLY • link 4.6 years ago by garcesj ▴ 50

0

Good to hear you could figure it out. I moved this comment to an answer so people having similar issues immediately see a potential solution.

ADD REPLY • link 4.6 years ago by ATpoint 85k

0

Can you give some background? What are these samples (organism, treatments), and what are the groups? What are condition and subject? Is a global change in the transcriptional profile expected? How were the samples preprocessed? You could also start by drawing the MA-plot manually from the raw data for the groups, outside of the DESeq2 results object: just the raw counts from the original matrix, average expression against fold change between the groups. This could help to decide whether this is a general problem with your data or whether the shrinkage might be corrupting things here.
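
A rough sketch of such a manual MA-plot in base R. The counts are simulated, the column names are assumed to follow the <subject>_<condition>_<run> pattern from the question, and the simple library-size scaling is an addition of this sketch (drop it to plot strictly raw counts).

```r
# Simulated stand-in for the raw count matrix (genes x samples).
set.seed(1)
samples <- c("341_MO_r150", "341_SP_r150", "345_MO_r129",
             "345_SP_r129", "350_MO_r178", "350_SP_r178")
cnt <- matrix(rnbinom(200 * length(samples), mu = 50, size = 2),
              ncol = length(samples),
              dimnames = list(paste0("gene", 1:200), samples))

# The group label is the second field of each column name (assumption).
condition <- sapply(strsplit(colnames(cnt), "_"), `[`, 2)

# Crude depth correction: scale every sample to the mean library size.
lib <- colSums(cnt)
scaled <- sweep(cnt, 2, lib / mean(lib), "/")

# Per-gene group means, with a pseudocount of 1 to avoid log2(0).
mean_mo <- rowMeans(scaled[, condition == "MO", drop = FALSE]) + 1
mean_sp <- rowMeans(scaled[, condition == "SP", drop = FALSE]) + 1

A <- (log2(mean_sp) + log2(mean_mo)) / 2   # average log2 expression
M <- log2(mean_sp) - log2(mean_mo)         # log2 fold change, SP over MO

plot(A, M, pch = 20, cex = 0.5,
     xlab = "mean log2 expression", ylab = "log2 fold change (SP / MO)")
abline(h = 0, col = "red")
```

If the cloud of points is already skewed or split here, the problem is in the data rather than in anything DESeq2 does afterwards.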

ADD REPLY • link 4.6 years ago by ATpoint 85k

0

Hi, @ATpoint. I'm comparing, within the same patient (e.g., 341; out of a large group of patients), two different conditions (SP and MO)... processed at the same time and following a 3'-RNAseq protocol. I wouldn't expect a very big difference, just a small one... could this be affecting the MA-plot? Thanks!

ADD REPLY • link 4.6 years ago by garcesj ▴ 50

1

Sorry, I do not understand. So you have a lot of patients, and from each you have two samples that represent the conditions, right? So subject is probably the patient itself? What is the CT value here? Probably not a qPCR CT, right? Can you make a barplot using the output of colSums on the raw counts? From this plot, it looks like many samples have quite low read counts: the ones on the lower left, colored blue to deep blue.
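
Something along these lines would do, as a sketch on simulated counts (the matrix and its unequal depths are made up for illustration; substitute your own raw count matrix for `cnt`):

```r
# Simulated stand-in for the raw count matrix, with deliberately unequal depth.
set.seed(2)
samples <- c("341_MO_r150", "341_SP_r150", "345_MO_r129",
             "345_SP_r129", "350_MO_r178", "350_SP_r178")
depth <- c(5, 200, 20, 150, 1, 80)
cnt <- sapply(depth, function(d) rnbinom(200, mu = d, size = 2))
dimnames(cnt) <- list(paste0("gene", 1:200), samples)

# Total raw counts per sample.
lib_sizes <- colSums(cnt)

# Sorted barplot makes under-sequenced samples easy to spot.
barplot(sort(lib_sizes), las = 2, cex.names = 0.7,
        ylab = "total raw counts")

# Numeric summary and the ratio between deepest and shallowest library.
summary(lib_sizes)
max(lib_sizes) / min(lib_sizes)
```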

ADD REPLY • link 4.6 years ago by ATpoint 85k

score 1 · Answer 1 · 2020-04-17

1

4.6 years ago

GouthamAtla 12k

If you are comparing between conditions within each patient (a paired analysis), your contrast looks wrong to me.

~~design = ~ condition+ subject~~

Shouldn't it be:

design = ~ subject + condition
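
The order matters because, as far as I know, results() called without `contrast` or `name` reports the last variable in the design formula. A small base R sketch of what the paired design looks like; the sample table is rebuilt from the column names in the question, which assumes they follow <subject>_<condition>_<run> (in practice you would use your own `mde`):

```r
samples <- c("341_MO_r150", "341_SP_r150", "345_MO_r129",
             "345_SP_r129", "350_MO_r178", "350_SP_r178")

# Split the names into subject / condition / run fields.
parts <- do.call(rbind, strsplit(samples, "_"))
mde <- data.frame(subject   = factor(parts[, 1]),
                  condition = factor(parts[, 2], levels = c("MO", "SP")),
                  row.names = samples)

# Paired design: subject first (blocking factor), condition last.
mm <- model.matrix(~ subject + condition, data = mde)
colnames(mm)
# "(Intercept)" "subject345" "subject350" "conditionSP"

# With DESeq2 loaded, the comparison can also be requested explicitly,
# which does not depend on the order of terms in the design:
# res <- results(dds, contrast = c("condition", "SP", "MO"))
```

The last coefficient, `conditionSP`, is the SP versus MO effect after accounting for per-subject baselines.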

If you want to troubleshoot:

Plot the distribution of normalised gene expression for all samples and see if there are any weird distributions.
Plot a PCA. In this case, you might see that the samples don't separate according to condition, but group by subject. In any case, you will spot outlier samples.
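
A minimal base R sketch of both checks on simulated counts. The log2(CPM + 1) transform is a stand-in chosen to keep the example self-contained; with a DESeq2 object you would more likely use vst() followed by plotPCA().

```r
# Simulated stand-in for the raw count matrix (genes x samples).
set.seed(3)
samples <- c("341_MO_r150", "341_SP_r150", "345_MO_r129",
             "345_SP_r129", "350_MO_r178", "350_SP_r178")
cnt <- matrix(rnbinom(500 * length(samples), mu = 100, size = 2),
              ncol = length(samples),
              dimnames = list(paste0("gene", 1:500), samples))

# Simple normalisation: counts per million, then log2 with a pseudocount.
log_cpm <- log2(sweep(cnt, 2, colSums(cnt) / 1e6, "/") + 1)

# 1) Per-sample distributions; a shifted or squashed box flags a problem sample.
boxplot(log_cpm, las = 2, cex.axis = 0.7, ylab = "log2(CPM + 1)")

# 2) PCA on samples (prcomp expects samples in rows, hence the transpose).
pca <- prcomp(t(log_cpm), center = TRUE, scale. = FALSE)

# Colour by condition, label by subject (both parsed from the column names).
subject   <- sapply(strsplit(samples, "_"), `[`, 1)
condition <- sapply(strsplit(samples, "_"), `[`, 2)
plot(pca$x[, 1], pca$x[, 2], pch = 19,
     col = as.integer(factor(condition)) + 1,
     xlab = "PC1", ylab = "PC2")
text(pca$x[, 1], pca$x[, 2], labels = subject, pos = 3)
```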

ADD COMMENT • link 4.6 years ago by GouthamAtla 12k

0

Thanks for your attention, @geek_y. Yes, indeed, it should be the way you point out... it was a typo.

ADD REPLY • link 4.6 years ago by garcesj ▴ 50