Question

Can I pool samples treated with different sgRNAs when testing for KO effects in RNA-seq analysis?

2.4 years ago

GLG ▴ 10

I have RNA-seq data for the following conditions (2 replicates each):

DMSO-treated (control guide), inhibitor-treated (control guide), DMSO + KO (two different sgRNAs), and inhibitor + KO (two different sgRNAs).

So far I have analyzed the data with DESeq2, grouping the samples by treatment and sgRNA:

metaData

                     treatment    sgRNA      genotype     grouped
 controlA_DMSO        DMSO        control         WT      control_DMSO
 controlB_DMSO        DMSO        control         WT      control_DMSO
 controlA_Treat.       1uM        control         WT      control_1uM
 controlB_Treat.       1uM        control         WT      control_1uM
 guide1A_DMSO         DMSO           1            KO      1_DMSO
 guide1B_DMSO         DMSO           1            KO      1_DMSO
 guide1A_Treat.        1uM           1            KO      1_1uM
 guide1B_Treat.        1uM           1            KO      1_1uM
 guide2A_DMSO         DMSO           2            KO      2_DMSO
 guide2B_DMSO         DMSO           2            KO      2_DMSO
 guide2A_Treat.        1uM           2            KO      2_1uM
 guide2B_Treat.        1uM           2            KO      2_1uM

And the following design:

dataSet <- DESeqDataSetFromMatrix(countData = counts, colData = metaData, design = ~ grouped)

Then I would get the results table for each contrast of interest, for example:

dds <- DESeq(dataSet)  # fit the model once, then extract each contrast from the fitted object
res <- results(dds, contrast = c("grouped", "control_1uM", "control_DMSO"), alpha = 0.05)  # ... etc

We are interested in the combined effect of knocking out protein X and pharmacologically inhibiting protein Y. However, we tried to knock out X with two different guides, and they yield different sets of DEGs (guide 1 gives more DEGs, although I'm not sure whether these are spurious results due to non-specific CRISPR cutting).

[Figure: heatmap of the DEGs common to both guides]

For the above heatmap, I called DEGs for contrast = c("grouped","1_1uM","control_DMSO") and contrast = c("grouped","2_1uM","control_DMSO"), then took the set of DEGs common to both results tables for plotting.
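
A minimal sketch of that intersection step, using toy data frames in place of the two DESeq2 results tables. The objects `res_g1` and `res_g2`, the gene names, and all numbers are made up for illustration. The sketch also requires the fold change to have the same sign for both guides, which is an extra filter and not part of what was done for the heatmap:

```r
# Toy stand-ins for as.data.frame(results(...)) from each guide; values are invented
res_g1 <- data.frame(
  row.names = c("geneA", "geneB", "geneC", "geneD"),
  log2FoldChange = c(1.8, -2.1, 0.9, 1.2),
  padj = c(0.001, 0.010, 0.200, 0.030)
)
res_g2 <- data.frame(
  row.names = c("geneA", "geneB", "geneC", "geneD"),
  log2FoldChange = c(1.5, -1.7, 1.1, -0.8),
  padj = c(0.004, 0.020, 0.010, 0.040)
)

# Genes passing the adjusted p-value cutoff; padj can be NA in DESeq2 output
sig_genes <- function(res, alpha = 0.05) {
  rownames(res)[!is.na(res$padj) & res$padj < alpha]
}

# DEGs called with both guides
common <- intersect(sig_genes(res_g1), sig_genes(res_g2))

# Keep only genes that change in the same direction with both guides
same_sign <- sign(res_g1[common, "log2FoldChange"]) ==
  sign(res_g2[common, "log2FoldChange"])
common_consistent <- common[same_sign]
```

Genes that are significant with both guides but change in opposite directions (geneD in the toy data) are dropped by the sign filter.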

I was wondering whether there is a better approach to the DEG analysis for the KO, and which design and test (Wald or LRT) would be best. I thought of the following approaches, but I'd like to hear other people's thoughts on them:

1. Would it be best to select just one of the sgRNAs and base the analysis on it (excluding samples that used the other guide)? If so, how could we look at our data to decide which guide gives the more confident/less spurious DEG results? (In the heatmap above, the two guides seem to be somewhat consistent, but they differ in a few subsets of genes.)
2. How could I build the design formula so that it tests for DEGs related to KO and to KO + treatment while controlling for differences between the two sgRNAs used?
3. If I pool both guide RNAs by grouping by treatment and genotype (instead of treatment and sgRNA), as shown below, would the dispersion estimates and underlying statistical testing in DESeq2 yield DEGs more consistent with the actual KO (for example, by giving the non-specific/spurious DEGs from each guide higher adjusted p-values)?

     grouped
     WTDMSO
     WTDMSO
     WT1uM
     WT1uM
     KODMSO
     KODMSO
     KO1uM
     KO1uM
     KODMSO
     KODMSO
     KO1uM
     KO1uM

I was thinking of going with the last option (pooling the samples treated with different guides), but I would like to hear what other people think.
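
To make the design-formula question concrete, here is a minimal base-R sketch (no DESeq2 needed) that rebuilds the sample table above and checks two things: that `genotype` is fully determined by `sgRNA`, so a formula containing both is not full rank, and what a contrast averaging the two guides would look like when there is one coefficient per `grouped` level. The object names `mm_nested`, `mm`, and `ko_treated_vs_ctrl` are hypothetical, and this only illustrates the idea; it is not a recommended analysis:

```r
# Sample table mirroring the metaData above (2 replicates per group)
metaData <- data.frame(
  treatment = factor(rep(rep(c("DMSO", "1uM"), each = 2), times = 3),
                     levels = c("DMSO", "1uM")),
  sgRNA = factor(rep(c("control", "1", "2"), each = 4),
                 levels = c("control", "1", "2"))
)
metaData$genotype <- factor(ifelse(metaData$sgRNA == "control", "WT", "KO"),
                            levels = c("WT", "KO"))
metaData$grouped <- factor(paste(metaData$sgRNA, metaData$treatment, sep = "_"))

# genotype is nested in sgRNA (KO = guide 1 or guide 2), so including both
# terms gives a design matrix with linearly dependent columns
mm_nested <- model.matrix(~ genotype + sgRNA + treatment, metaData)
rank_deficient <- qr(mm_nested)$rank < ncol(mm_nested)

# One coefficient per group, equivalent in fit to ~ grouped
mm <- model.matrix(~ 0 + grouped, metaData)
colnames(mm) <- levels(metaData$grouped)

# Contrast: mean of the two treated KO guides minus control guide in DMSO
ko_treated_vs_ctrl <- setNames(numeric(ncol(mm)), colnames(mm))
ko_treated_vs_ctrl[c("1_1uM", "2_1uM")] <- 0.5
ko_treated_vs_ctrl["control_DMSO"] <- -1

print(rank_deficient)
print(ko_treated_vs_ctrl)
```

This keeps the two guides as separate groups when dispersions are estimated, while still testing a single averaged KO effect. In DESeq2 itself, an averaged comparison of this kind should be expressible through the list form of `contrast` in `results()` with `listValues = c(1/2, -1)`; that is untested here.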

CRISPR RNA-seq DESeq2 • 614 views
