I am trying to compare 3 bulk RNA-Seq datasets that all used PolyA capture for library preparation, though the protocol details differ because each dataset comes from a different group of authors and labs. Unlike previous questions on this topic, my datasets consist of 3 groups of interest, each coming from a different study (each study has data on only one group of interest). Can I reliably compare these datasets?
My strategy so far: I downloaded the FASTQ reads, ran QC, and quantified them with Salmon. I then imported the results into R and ran DESeq2, but the number of differentially expressed (DE) genes is implausibly high: around 30% of the genes in each group are DE compared to the reference. I also tried Recount3 counts with DESeq2 and still obtained ~30% DE genes. Could these results be valid?
Thanks!
It is impossible to tell from this description. However, a good option here would be a meta-analysis: if the groups are concerningly different and you don't think you can control for that, derive a test statistic for each study separately, then combine them at the end.
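To make the "combine them at the end" step concrete, here is a minimal sketch using Fisher's method, which is one common way to combine p-values (the answer does not name a specific method, so this choice is an assumption). It assumes each study can produce its own per-gene p-value for the same contrast and that the studies are independent; whether that holds when each study contains only one group is a separate question. The gene names and p-values are hypothetical, purely for illustration.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values for one gene using Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, where k is the number of
    studies. For even degrees of freedom the survival function has a
    closed form, so no external libraries are needed.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # P(chi2_{2k} > X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


# Hypothetical per-study p-values for two genes (illustrative only).
per_gene = {
    "GENE_A": [0.01, 0.01],
    "GENE_B": [0.5, 0.8],
}
combined = {gene: fisher_combine(ps) for gene, ps in per_gene.items()}
```

Note that Fisher's method ignores the direction of the effect, so in practice you would either combine one-sided p-values or use a signed approach such as Stouffer's weighted Z, and you would still need multiple-testing correction on the combined p-values.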