Question

batch correction in DESeq2

0

13 months ago

Nelo ▴ 20

Hi!

I ran DESeq2 using the colData below:

sample  treatment
M24_virus_rep1  treated
M24_virus_rep2  treated
M24_virus_rep3  treated
M0_controlrep1  control
M0_controlrep2  control
M0_controlrep3  control

with code:

library(DESeq2)
# Read the sample sheet and subset the count matrix to the genes of interest
colData <- read.delim("colData.csv", header = TRUE, sep = ",")
aqp_counts <- counts[aqps, ]
aqp_dds <- DESeqDataSetFromMatrix(countData = aqp_counts, colData = colData, design = ~ treatment)
aqp_dds <- DESeq(aqp_dds)
aqp_res <- results(aqp_dds)
aqp_res_ordered <- aqp_res[order(aqp_res$padj), ]
# Keep genes with padj <= 0.01 and |log2FC| >= 1 (which() drops NA padj values)
aqp_de_genes <- rownames(aqp_res_ordered)[which(aqp_res_ordered$padj <= 0.01 & abs(aqp_res_ordered$log2FoldChange) >= 1)]
aqp_vsd <- varianceStabilizingTransformation(aqp_dds, blind = TRUE)
pca <- plotPCA(aqp_vsd, intgroup = "treatment")
pca

But the samples in my PCA plot do not cluster according to the biological groups of interest. Instead, some of the control samples are mixed in with the treated samples. So, before moving on to downstream analysis, I'm thinking about doing batch correction. Can someone help with it, please?

DESeq2 • 986 views

ADD COMMENT • link updated 13 months ago by Ram 44k • written 13 months ago by Nelo ▴ 20

0

Please read the vignette, which covers this. If you have questions, show the plot and what you've tried. Right now, "help me please" is open-ended and cannot be answered.
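
For readers who want a starting point, the approach described in the DESeq2 vignette can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the `batch` column is hypothetical (the colData in this thread has no batch information), and simulated counts from `makeExampleDESeqDataSet` stand in for real data. The idea is to model batch in the design for testing, and to remove it from the transformed values only for visualization.

```r
library(DESeq2)

set.seed(1)
# Simulated counts stand in for real data (1000 genes, 8 samples).
dds <- makeExampleDESeqDataSet(n = 1000, m = 8)
dds$treatment <- factor(rep(c("control", "treated"), each = 4),
                        levels = c("control", "treated"))
# Hypothetical batch labels; nothing in this thread defines batches.
dds$batch <- factor(rep(c("b1", "b2"), times = 4))

# For testing: include batch in the design instead of altering the counts.
design(dds) <- ~ batch + treatment
dds <- DESeq(dds)
res <- results(dds, name = "treatment_treated_vs_control")

# For visualization only: remove the batch effect from the transformed values,
# protecting the treatment effect through the design matrix.
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)
mm <- model.matrix(~ treatment, colData(vsd))
assay(vsd) <- limma::removeBatchEffect(assay(vsd), batch = vsd$batch, design = mm)
plotPCA(vsd, intgroup = "treatment")
```

The batch-adjusted matrix is meant for plots such as PCA; the differential expression test itself uses the original counts with `batch` in the design.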

ADD REPLY • link 13 months ago by ATpoint 85k

0

[Attached image: PCA plot, not reproduced here]

ADD REPLY • link updated 13 months ago by ATpoint 85k • written 13 months ago by Nelo ▴ 20

0

Here is the plot.

ADD REPLY • link 13 months ago by Nelo ▴ 20

0

This plot does not match the metadata above, which has 3 treated vs. 3 untreated samples; here you have 6 treated vs. 2 untreated.

ADD REPLY • link 13 months ago by ATpoint 85k

0

The metadata I provided above was intentionally changed for posting here, but you later asked for the plot as well, so I posted the plot as is. The original colData is as follows:

sample  treatment
M24_virus_rep1  treated
M24_virus_rep2  treated
M168_virus_rep1  treated
M168_virus_rep2  treated
M72_virus_rep1  treated
M72_virus_rep2  treated
M0_control1  control
M0_control2  control
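
With this layout, the six "treated" samples fall into three distinct name prefixes (M24, M72, M168). A minimal sketch of how that structure could be made explicit in the sample sheet is shown below. The `group` column is a hypothetical addition, and the thread does not say what the prefixes stand for, so no meaning is assumed beyond the labels themselves.

```r
# Sample names copied from the colData above.
samples <- c("M24_virus_rep1", "M24_virus_rep2",
             "M168_virus_rep1", "M168_virus_rep2",
             "M72_virus_rep1", "M72_virus_rep2",
             "M0_control1", "M0_control2")

colData <- data.frame(
  sample = samples,
  treatment = factor(ifelse(grepl("control", samples), "control", "treated"))
)

# Hypothetical extra column: the text before the first underscore.
colData$group <- factor(sub("_.*$", "", samples),
                        levels = c("M0", "M24", "M72", "M168"))

# Cross-tabulate to check how many samples sit in each cell.
table(colData$treatment, colData$group)
```

If such a column is present in the colData of the DESeq2 object, `plotPCA(aqp_vsd, intgroup = c("treatment", "group"))` would label points by both variables, which may show whether the samples that overlap with the controls share a prefix.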

ADD REPLY • link 13 months ago by Nelo ▴ 20

score 2 · Answer 1 · 2023-10-31

2

13 months ago

swbarnes2 14k

Was the experiment really planned so poorly that you have batches between only 8 samples?

Sorry, but the simplest explanation is that one of those viruses isn't changing gene expression much in the genes you are looking at, in the tissue you are looking at.
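
One way to check this, sketched here under assumptions rather than taken from the thread, is to compare each treated group against the controls separately and count the genes that pass the thresholds from the question. The `group` labels below are hypothetical (derived from the sample-name prefixes), and simulated counts from `makeExampleDESeqDataSet` stand in for the real data, so the numbers it prints say nothing about the actual experiment.

```r
library(DESeq2)

set.seed(1)
# Simulated counts; only the layout (4 groups x 2 replicates) mirrors the thread.
dds <- makeExampleDESeqDataSet(n = 1000, m = 8)
dds$group <- factor(rep(c("M0", "M24", "M72", "M168"), each = 2),
                    levels = c("M0", "M24", "M72", "M168"))
design(dds) <- ~ group
dds <- DESeq(dds)

# Per group versus M0: number of genes with padj <= 0.01 and |log2FC| >= 1.
n_de <- sapply(c("M24", "M72", "M168"), function(g) {
  res <- results(dds, contrast = c("group", g, "M0"))
  sum(res$padj <= 0.01 & abs(res$log2FoldChange) >= 1, na.rm = TRUE)
})
n_de
```

On real data, a group with very few genes passing the thresholds would be consistent with that treatment having little effect on the genes examined, which is a biological result and not something batch correction should remove.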

ADD COMMENT • link 13 months ago by swbarnes2 14k