Hello,
I have a RIP-seq experiment with 3 replicates per condition. I followed the classic RNA-seq analysis pipeline: after quantification with Salmon, I transformed the counts and compared the correlation between replicates. This is the code:
library(DESeq2)  # DESeqDataSetFromTximport, DESeq, rlog, vst, counts
library(edgeR)   # cpm
library(dplyr)   # bind_rows, mutate, %>%

# sample sheet: 3 IP (12DD) replicates and 3 input replicates
sample_info_12DD <- DataFrame(
  condition = c('12DD', '12DD', '12DD',
                'input', 'input', 'input'),
  row.names = c("12DD_rep1", "12DD_rep2", "12DD_rep3",
                "input_rep1", "input_rep2", "input_rep3"))
dds_12DD <- DESeqDataSetFromTximport(txi_12DD, colData = sample_info_12DD, design = ~ condition)
# here we remove genes with 0 reads in all samples
keep_genes <- rowSums(counts(dds_12DD)) > 0  # TRUE if the gene has more than 0 reads, otherwise FALSE
dds_12DD <- dds_12DD[keep_genes, ]
# cpm filtering: keep genes with CPM > 1 in at least 3 samples
keep <- rowSums(cpm(counts(dds_12DD)) > 1) >= 3
dds_12DD <- dds_12DD[keep, ]
dds_12DD <- DESeq(dds_12DD)
# rlog transformation
rld <- rlog(dds_12DD)
# vst transformation (named vsd so the object does not mask the vst() function)
vsd <- vst(dds_12DD)
# one long data frame holding the three transformations, for side-by-side plots
df <- bind_rows(
  as.data.frame(log2(counts(dds_12DD, normalized = TRUE) + 1)) %>% mutate(transformation = "log2(x + 1)"),
  as.data.frame(assay(vsd)) %>% mutate(transformation = "vst"),
  as.data.frame(assay(rld)) %>% mutate(transformation = "rlog"))
lvls <- c("log2(x + 1)", "vst", "rlog")
df$transformation <- factor(df$transformation, levels = lvls)
The results:
PS: when I try the 3rd replicate, and also other conditions, I get the same correlation profile.
My question is: can I proceed with differential expression analysis given that the correlation between replicates looks weird?
As i.sudbery said below, nothing looks too odd to me either. Would you be able to specify roughly what you mean by weird? That might help narrow the answer down to something more specific (and also help other people who encounter similar-looking plots and have the same question).
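One way to make "weird" more concrete might be to report the pairwise correlations between samples as numbers rather than judging them from scatter plots. Below is a minimal sketch in base R. The matrix `mat` is simulated toy data, used only so the example runs on its own; it is not the poster's data. With the real experiment one would presumably pass the rlog or vst assay matrix (genes in rows, samples in columns) instead. The names `mat`, `cor_mat`, `mean_within` and `mean_between` are made up for illustration.

```r
# Toy stand-in for a transformed expression matrix (genes x samples).
# ASSUMPTION: simulated values, only to keep the sketch self-contained.
set.seed(1)
n_genes    <- 500
base_ip    <- rnorm(n_genes, mean = 8, sd = 2)  # shared signal of the IP samples
base_input <- rnorm(n_genes, mean = 8, sd = 2)  # shared signal of the input samples

mat <- cbind(
  sapply(1:3, function(i) base_ip    + rnorm(n_genes, sd = 0.3)),
  sapply(1:3, function(i) base_input + rnorm(n_genes, sd = 0.3))
)
colnames(mat) <- c("12DD_rep1", "12DD_rep2", "12DD_rep3",
                   "input_rep1", "input_rep2", "input_rep3")

# Pairwise Pearson correlation between all samples
cor_mat <- cor(mat, method = "pearson")
print(round(cor_mat, 3))

# Condition label of each sample, taken from the column names
condition <- sub("_rep[0-9]+$", "", colnames(mat))
same_cond <- outer(condition, condition, "==")

# Average correlation within a condition vs. between conditions
# (upper triangle only, so each pair is counted once and the diagonal is skipped)
mean_within  <- mean(cor_mat[same_cond  & upper.tri(cor_mat)])
mean_between <- mean(cor_mat[!same_cond & upper.tri(cor_mat)])
cat("mean within-condition r: ", round(mean_within, 3), "\n")
cat("mean between-condition r:", round(mean_between, 3), "\n")
```

If the within-condition values are clearly higher than the between-condition ones, the replicates are at least grouping as expected; posting those numbers alongside the plots would make it easier for others to judge whether anything is actually off.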