I have a set of RNAseq data from 4 wt, 10 hets and 2 ko. I am trying to see if more replicates in hets leads to more degs in het vs ko than wt vs ko. Hence, I am here trying to take only 4 random replicates of hets from the DeseqDataSet and run analysis multiple times.
In each run, only difference is which hets are in the DESeqDataSet. All runs have have the same wt and het. Even then, my DEGs for wt vs ko are different, in each run. How is that?
random_het <- function(dds_lcr){
minusHet <- sampleTable %>% filter(genotype %in% "KO.WT") %>% select(embryo) %>% distinct() %>% slice_sample(n =6) %>% as.list %>% unlist %>% paste0(., collapse = "|")
dds_4het <- dds_lcr[, -grep(minusHet, colnames(dds_lcr))]
dde_4het <- DESeq(dds_4het)
WtVsKo4het <- results(dde_4het, name = "genotype_WT_vs_KO", lfcThreshold = 0, alpha = 0.05)
HetVsKo4het <- results(dde_4het, name = "genotype_KO.WT_vs_KO", lfcThreshold = 0, alpha = 0.05)
sum.data <- map(list(WtVsKo4het, HetVsKo4het), getSig, padj = 0.05) %>% map(., nrow) %>% as.data.frame %>% set_names(c("WtVsKo4het", "HetVsKo4het"))
names(sum.data) <- c("WtVsKo4het", "HetVsKo4het")
data <- list(sum.data, embyos = colData(dds_4het))
return(data)
}
randomise_plot <- function(n){
randomiseHet <- purrr::rerun(n, random_het(dds_lcr))
df_dge <- sapply(randomiseHet, "[[", 1) %>% t %>% as.data.frame
plot(df_dge)
p1 <- recordPlot()
dds_names <- sapply(randomiseHet, "[[", 2)
return(list(p1, dds_names))
}
randomise_plot(n=10)
The plot shows in two runs there are different number of DEGs for both comparison. plot
Thank you. That makes perfect sense.