Hello,
I have a time-course RNA-seq dataset with two biological replicates per condition. To identify genes that are differentially expressed (DE) over time, I used DESeq2 with the likelihood ratio test (LRT). Now I would like to identify groups of genes that behave similarly over time using DEGreport::degPatterns(). As input I use the significantly DE genes (as identified by the LRT) and the rlog-transformed count matrix.

This results in many clusters, some of which show very similar patterns and contain only a few genes. Raising the minimum number of genes a cluster must contain to be reported does not improve this much.

Since my replicates behave very similarly (as assessed by PCA and hclust), I was thinking about merging them using DESeq2::collapseReplicates(). Mike Love's reply to a question on the Bioconductor forum reassures me that this might be a good idea (https://support.bioconductor.org/p/114506/).

My concern now, however, is the size factors after collapsing the replicates. From what I understand, the size factor from the first replicate is used. Is this correct? Or are the size factors recalculated for the merged counts during the rlog transformation?
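To make the size-factor concern concrete, here is a toy sketch in plain Python (not DESeq2 code) of the median-of-ratios idea behind DESeq2's default size-factor estimation. It is applied once to per-replicate counts and once to counts summed per condition. The counts, the sample labels, and the `size_factors` and `collapse` helpers are all made up for illustration. The only point is that size factors estimated on summed counts need not match the size factor of the first replicate.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples table (toy version)."""
    n_samples = len(counts[0])
    kept, log_geo_means = [], []
    for row in counts:
        # Skip genes with any zero count: their geometric mean would be zero.
        if all(c > 0 for c in row):
            kept.append(row)
            log_geo_means.append(sum(math.log(c) for c in row) / n_samples)
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene's count in sample j to that gene's geometric mean.
        log_ratios = [math.log(row[j]) - lg for row, lg in zip(kept, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def collapse(counts, groups):
    """Sum the columns that share a group label, keeping first-seen label order."""
    labels = list(dict.fromkeys(groups))
    return [
        [sum(c for c, g in zip(row, groups) if g == lab) for lab in labels]
        for row in counts
    ]


# Hypothetical counts: 5 genes x 4 samples (t0_rep1, t0_rep2, t1_rep1, t1_rep2).
counts = [
    [100, 210, 90, 120],
    [50, 95, 60, 70],
    [200, 405, 180, 260],
    [10, 22, 30, 45],
    [80, 150, 85, 100],
]
groups = ["t0", "t0", "t1", "t1"]

sf_before = size_factors(counts)      # one size factor per replicate
collapsed = collapse(counts, groups)  # replicate counts summed per time point
sf_after = size_factors(collapsed)    # re-estimated on the summed counts

print("per-replicate size factors:", [round(s, 3) for s in sf_before])
print("re-estimated after collapsing:", [round(s, 3) for s in sf_after])
```

In DESeq2 itself, calling `sizeFactors()` on the collapsed object would show which values it actually carries.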
I am thankful for any comment or suggestion :)
True! Argh, I was overcomplicating things. Thanks a lot!
No worries, happens to all of us once in a while :)