I have two RNA-Seq data sets to analyze. They were generated by the same person, but a year apart. I was surprised to see such a large difference between the two data sets. Is it even possible to compensate for that?
In my analysis I also included the time difference as a batch term in the design. Unfortunately, I don't see any improvement in how well the two data sets agree.
The analysis code and the before/after PCA are added below.
I would like to know whether this means I can't analyze the two data sets together, or only that I need to be careful when interpreting the results.
I am using DESeq2. Would it be better to use a different tool to compensate for this bias?
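library(DESeq2)
library(dplyr)    # inner_join
library(tibble)   # rownames_to_column, column_to_rownames
library(ggplot2)
library(ggrepel)  # geom_text_repel

# Keep only genes present in both count matrices, joined on gene ID
cts <- inner_join(rownames_to_column(cts_p203), rownames_to_column(cts_p405), by = "rowname") |>
  column_to_rownames("rowname")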
...
# 18 samples: 6 groups x 3 replicates
coldata <- data.frame(
  name = paste0(rep(c("Wt_p203_", "WT_p405_", "Cys_", "Thr_", "Arg_", "Leu_"), each = 3), rep(1:3, times = 6)),
  condition = c(rep("control", times = 6), rep(c("Cystein", "Threonin", "Arginin", "Leucin"), each = 3)),
  batch = c(rep("p203", times = 3), rep("p405", times = 9), rep("p203", times = 6))
)
dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~ batch + condition)
dds
# Keep genes with at least 10 counts in at least 3 samples
keep <- rowSums(counts(dds) >= 10) >= 3
dds <- dds[keep, ]
vsd <- vst(dds, blind = FALSE)
...
pcaData <- plotPCA(vsd, intgroup = c("condition", "batch"), returnData = TRUE)
# plotPCA() attaches the variance explained per PC as an attribute
percentVar <- round(100 * attr(pcaData, "percentVar"))
ggplot(pcaData, aes(x = PC1, y = PC2, color = condition, shape = batch)) +
  geom_point() +
  scale_color_manual(values = c25[15:19]) +
  scale_alpha_manual(values = c(0.3, 0.7), guide = "none") +
  geom_text_repel(aes(label = name), size = 2) +
  xlab(paste0("PC1: ", percentVar[1], "% variance")) +
  ylab(paste0("PC2: ", percentVar[2], "% variance")) +
  ggtitle("PCA with vst-transformed data, colored by conditions") +
  labs(shape = "Group", color = "Condition")
Thanks
Same kit, extraction method, library prep, sequencer (2- or 4-color chemistry), and read length?
As far as I know, yes. It was a long time ago, and the person responsible for it has unfortunately already left, but the core facility says that is the case.
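One point on the PCA itself: putting `batch` in the design (`~ batch + condition`) only adjusts the DESeq2 model used for differential expression testing. It does not change the values returned by `vst()`, so a PCA of `vsd` is expected to show the same batch split before and after. To see what the data look like with the batch term taken out, the batch effect has to be subtracted from the transformed matrix for visualization only. The sketch below is a minimal base-R illustration of that idea on simulated values (the matrix `mat`, the gene count, and the effect sizes are all made up for the example, and the sample layout is copied from the `coldata` in the question). With the real objects, the usual route would be `limma::removeBatchEffect()`, shown in the final comment.

```r
set.seed(1)

# Sample layout copied from the question: controls exist in both batches,
# each treatment exists in only one batch.
condition <- factor(
  c(rep("control", 6), rep(c("Cystein", "Threonin", "Arginin", "Leucin"), each = 3)),
  levels = c("control", "Cystein", "Threonin", "Arginin", "Leucin")
)
batch <- factor(c(rep("p203", 3), rep("p405", 9), rep("p203", 6)))

# Simulated log-scale expression standing in for assay(vsd); NOT real data.
n_genes <- 200
batch_shift <- rnorm(n_genes, sd = 2)  # hypothetical per-gene batch effect
mat <- matrix(rnorm(n_genes * 18, mean = 8, sd = 0.3), nrow = n_genes)
mat <- mat + outer(batch_shift, as.numeric(batch == "p405"))

# Fit expression ~ condition + batch for every gene at once.
# The design is full rank only because the controls are in both batches.
X <- model.matrix(~ condition + batch)
fit <- lm.fit(X, t(mat))  # coefficients: ncol(X) rows, one column per gene

# Subtract the fitted batch component, leaving the condition effects in place.
# This moves the p405 samples onto the p203 level.
batch_col <- grep("^batch", colnames(X))
batch_part <- X[, batch_col, drop = FALSE] %*% fit$coefficients[batch_col, , drop = FALSE]
corrected <- mat - t(batch_part)

# Average per-gene gap between the control samples of the two batches.
ctrl <- condition == "control"
gap <- function(m) {
  mean(abs(rowMeans(m[, ctrl & batch == "p405"]) - rowMeans(m[, ctrl & batch == "p203"])))
}
gap_before <- gap(mat)
gap_after <- gap(corrected)

# PCA on the uncorrected and corrected matrices (samples in rows).
pca_before <- prcomp(t(mat))
pca_after <- prcomp(t(corrected))

# With the objects from the question, the equivalent step would be roughly:
#   assay(vsd) <- limma::removeBatchEffect(
#     assay(vsd), batch = vsd$batch,
#     design = model.matrix(~ condition, colData(vsd)))
#   plotPCA(vsd, intgroup = c("condition", "batch"))
```

The corrected matrix is meant for plots and clustering only; the differential expression test should still use the raw counts with `~ batch + condition`. Note also that in this layout the batch effect is estimated from the control samples alone, so the correction is only as good as the assumption that the batch shifts the treated samples in the same way.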