DESeq2 MA plot - linear relationship for pooled amplicons
2.1 years ago
bibrgr • 0

Hello all,

I am sequencing samples from a barcoded bacteria library (essentially pooled amplicon sequencing a la shRNA/CRISPR) and have noticed a strange correlation between normalized count mean and LFC on an MA plot. Here's the plot: enter image description here

The MA plot looks otherwise normal, but LFC appears to be correlated with the mean. These samples are from the library pre- and post-selection. I interpret this as a global change in expression (in my case, population) from pre-selection/control to post-selection/treatment, which may violate the assumptions of DESeq2. Is that right? Also, since the correlation is linear, is there a way to correct for this trend?
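To put a number on a trend like this, one option is to fit a straight line to the MA values. The following is a minimal sketch in plain Python, not part of the original analysis. It uses the classic definitions (M is the log2 fold change, A is the mean of the two log2 abundances), which differ slightly from DESeq2's MA plot, where the x-axis is the mean of normalized counts. The function names, the pseudocount, and the toy counts are all assumptions made for illustration.

```python
import math

def ma_values(control, treated, pseudocount=0.5):
    """Return (A, M): mean log2 abundance and log2 fold change per feature."""
    a_vals, m_vals = [], []
    for c, t in zip(control, treated):
        log_c = math.log2(c + pseudocount)
        log_t = math.log2(t + pseudocount)
        a_vals.append((log_c + log_t) / 2)
        m_vals.append(log_t - log_c)
    return a_vals, m_vals

def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up normalized counts, not the poster's data: the fold change
# grows with abundance, so the fitted slope is clearly non-zero.
pre = [4, 16, 64, 256]
post = [8, 64, 512, 4096]
a_vals, m_vals = ma_values(pre, post, pseudocount=0)
slope, intercept = fit_line(a_vals, m_vals)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
```

A slope near zero is what a well-normalized, mostly unchanged library would give. A clearly non-zero slope only describes the trend; it does not by itself say whether the cause is biological or a normalization problem.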

The image suffers from link rot, but I imagine this could be analogous to the situation described in "Weird MA-plot from array data - help?", although I'm not sure whether a consensus was reached there.

The analysis was performed with default DESeq2 parameters, except that the Cook's cutoff was disabled (which doesn't affect the trend of the MA plot) and genes with low counts were discarded. The size factors range from 0.33 to 1.75.

Thank you!

MA plot DESeq2 • 1.1k views

Is this a dropout screen? Do you expect a balanced DE profile or mostly dropouts? Is this normalized to some sort of non-targeting control? I'm not sure whether you need DESeq2 or a specialized approach such as MAGeCK, which after all uses some of the ideas of DESeq2 under the hood.


Since different strains are competing with each other, I expect both dropout and enrichment. However, I don't have sgRNAs but rather a barcode library, so I only have 1-2 knockouts/gene and much higher efficiency of knockout.

2.1 years ago

That MA plot does not look right.

You might use the code linked below to see whether your data fit the negative binomial model that DESeq expects.

https://github.com/bioramble/sequencing/blob/master/nb.R
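As a rough idea of what such a check looks at (this is not the linked script, and not how DESeq2 estimates dispersion, which uses a likelihood-based estimate with shrinkage): under a negative binomial model the variance of a feature is mu + alpha * mu^2, so a crude method-of-moments estimate of the dispersion alpha can be computed from replicate counts. The function name and the replicate values below are assumptions made for illustration.

```python
from statistics import mean, variance

def moment_dispersion(replicates):
    """Method-of-moments NB dispersion: alpha = (var - mean) / mean**2.

    Values near 0 mean Poisson-like noise; larger values mean overdispersion.
    Negative values can occur by chance when there are few replicates.
    """
    mu = mean(replicates)
    if mu <= 0:
        raise ValueError("mean count must be positive")
    return (variance(replicates) - mu) / mu ** 2

# Made-up normalized counts for two features across three replicates.
poisson_like = [90, 100, 110]
overdispersed = [50, 100, 150]
print(moment_dispersion(poisson_like), moment_dispersion(overdispersed))
```

Plotting this estimate against the mean for every feature gives a quick view of whether the variance grows with the mean in the way a negative binomial model would predict.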

If it doesn't then you can't use DESeq. The other thing you could try is to pick out a set of 1020 genes that are unchanging, and use those for normalization, instead of letting the software figure it out.
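To illustrate why the choice of normalization features matters, here is a minimal sketch of median-of-ratios size factors in plain Python, with an optional restriction to a set of control rows. In DESeq2 itself, `estimateSizeFactors` accepts a `controlGenes` argument for this purpose; the function below, its parameter names, and the toy matrix are assumptions for illustration, not DESeq2 code.

```python
import math
from statistics import median

def size_factors(counts, control_rows=None):
    """Median-of-ratios size factors.

    counts: list of feature rows, each a list of per-sample raw counts.
    control_rows: optional row indices to restrict the estimate to.
    """
    rows = counts if control_rows is None else [counts[i] for i in control_rows]
    # A zero count makes the log geometric mean undefined, so skip those rows.
    rows = [row for row in rows if all(c > 0 for c in row)]
    if not rows:
        raise ValueError("no usable feature with positive counts in every sample")
    n_samples = len(rows[0])
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in rows]
    return [
        math.exp(median(math.log(row[j]) - lgm for row, lgm in zip(rows, log_geo_means)))
        for j in range(n_samples)
    ]

# Made-up counts (rows = barcodes, columns = pre, post). Rows 0-1 are
# unchanged; rows 2-4 are depleted fourfold after selection.
counts = [[100, 100], [200, 200], [400, 100], [800, 200], [1600, 400]]
print(size_factors(counts))                        # all rows
print(size_factors(counts, control_rows=[0, 1]))   # unchanged rows only
```

In this toy case most rows change in the same direction, so the all-rows estimate treats the depletion as a library-size difference, while the control-only estimate leaves the two samples on the same scale.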


Thanks! It does seem to fit - I remember I had tried GAMLSS on this dataset and the NB models seemed to be the most accurate.

enter image description here

I don't use it as much but edgeR appears to have a similar issue:

enter image description here
