Does anyone have advice for comparing the output of two RNA-seq runs? Basically, I'm comparing two conditions (control and treatment) from tumor samples. I have one run with 6 samples control/treatment and a second one that was done with 8 samples control/treatment. It's the same setup for both, but I'm expecting some amount of batch effect, not just because they're separate runs but because they're tumors, which can vary in their growth between experiments.
I don't necessarily need to combine them into one big experiment (is that even feasible?), but I'm thinking about ways to present the data together. Should I make a Venn diagram of hits from each run? Can I plot LFC vs. LFC, and would that even be helpful? Most of the analysis I've already generated was with the 6-sample cohort, but since I also have this 8-sample cohort it seems a waste not to use the data. I'm looking for ideas on how to increase confidence in hits with the data and how to "show my work" with plots etc.
Edit: The combined analysis doesn't have to be "in" DESeq2 (that's what I used for each individual run), just looking for ideas.
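For the LFC vs. LFC and Venn diagram ideas, here is a rough Python sketch of the numbers that would sit behind those plots. It assumes each run's DESeq2 results have been exported to a gene -> (log2 fold change, adjusted p-value) mapping; the function names, the 0.05 cutoff, and the toy values are all made up for illustration, not taken from either run.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def compare_runs(run_a, run_b, padj_cutoff=0.05):
    """Summarize agreement between two runs.

    run_a / run_b: dict mapping gene -> (log2 fold change, adjusted p-value).
    Only genes present in both runs are compared.
    """
    shared = sorted(set(run_a) & set(run_b))
    lfc_a = [run_a[g][0] for g in shared]
    lfc_b = [run_b[g][0] for g in shared]
    hits_a = {g for g in shared if run_a[g][1] < padj_cutoff}
    hits_b = {g for g in shared if run_b[g][1] < padj_cutoff}
    both = hits_a & hits_b
    # A hit called in both runs is more convincing if the direction agrees.
    same_direction = {g for g in both if run_a[g][0] * run_b[g][0] > 0}
    return {
        "n_shared": len(shared),
        "lfc_correlation": pearson(lfc_a, lfc_b),  # what the scatter plot shows
        "hits_a_only": sorted(hits_a - hits_b),    # Venn: left only
        "hits_b_only": sorted(hits_b - hits_a),    # Venn: right only
        "hits_both": sorted(both),                 # Venn: overlap
        "hits_both_same_direction": sorted(same_direction),
    }


# Toy values, invented purely for illustration.
run6 = {"GENE1": (2.1, 0.001), "GENE2": (-1.5, 0.01),
        "GENE3": (0.2, 0.60), "GENE4": (1.0, 0.04)}
run8 = {"GENE1": (1.8, 0.002), "GENE2": (-1.2, 0.03),
        "GENE3": (-0.1, 0.80), "GENE4": (0.4, 0.20)}

summary = compare_runs(run6, run8)
print(summary)
```

The overlap counts depend heavily on the cutoff, so the LFC correlation across all shared genes is probably the more robust of the two summaries.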
No, the experimental setup was very similar - same treatment, same type of cells, etc.
Would this be OK even if the sequencing depth was different for each run? The run was the same type but the total reads/sample was about 30% lower in the second run.
Shouldn't matter. DESeq2 will normalize for sequencing depth, and including the batch effect variable in your design should mitigate most other technical effects.
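To make the depth normalization less of a black box: DESeq2 estimates a per-sample size factor with a median-of-ratios approach. Below is a simplified Python sketch of that idea, not DESeq2's actual code; the function name and the toy counts are made up, with the second sample simulated at 70% of the first sample's depth.

```python
from math import exp, log
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (simplified sketch).

    counts: dict mapping sample -> list of raw counts, genes in the same order.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Keep only genes with nonzero counts in every sample so logs are defined.
    usable = [i for i in range(n_genes)
              if all(counts[s][i] > 0 for s in samples)]
    # Log of the per-gene geometric mean across samples (a pseudo-reference).
    log_ref = {i: sum(log(counts[s][i]) for s in samples) / len(samples)
               for i in usable}
    # Size factor = median ratio of a sample's counts to the reference.
    return {s: exp(median(log(counts[s][i]) - log_ref[i] for i in usable))
            for s in samples}


# Toy counts: "shallow" has exactly 70% of the reads of "deep" for every gene.
counts = {
    "deep": [100, 200, 50, 400, 0],
    "shallow": [70, 140, 35, 280, 0],
}

factors = size_factors(counts)
normalized = {s: [c / factors[s] for c in counts[s]] for s in counts}
print(factors)
print(normalized)
```

After dividing by the size factors, the two toy samples have matching values for every gene, which is the sense in which a uniform 30% drop in depth gets absorbed by normalization.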
Slap 'em all together and look at the PCA. It'll tell you if there are real issues to try to deal with or not.
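As a rough illustration of what the PCA check is looking for, here is a small dependency-free Python sketch that computes first-principal-component scores by power iteration. In practice you'd use DESeq2's own PCA plot on transformed counts; the function name, sample names, and expression values below are invented, with a deliberately large run effect built in.

```python
from math import sqrt


def first_pc_scores(expr, n_iter=200):
    """Scores on the first principal component, via power iteration.

    expr: dict mapping sample -> list of log-scale expression values,
    genes in the same order for every sample.
    """
    samples = list(expr)
    n = len(samples)
    n_genes = len(expr[samples[0]])
    # Center each gene across samples.
    means = [sum(expr[s][g] for s in samples) / n for g in range(n_genes)]
    centered = [[expr[s][g] - means[g] for g in range(n_genes)]
                for s in samples]
    # Sample-by-sample Gram matrix; its top eigenvector gives PC1 up to scale.
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centered]
            for ri in centered]
    vec = [float(i + 1) for i in range(n)]  # arbitrary starting vector
    for _ in range(n_iter):
        nxt = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigval = sum(vec[i] * gram[i][j] * vec[j]
                 for i in range(n) for j in range(n))
    # The sign of a principal component is arbitrary; only grouping matters.
    return {s: sqrt(eigval) * v for s, v in zip(samples, vec)}


# Toy data: run 2 is shifted on genes 1-2, treatment shifts gene 3 slightly.
expr = {
    "run1_ctrl": [5.0, 5.0, 5.0, 5.0],
    "run1_trt":  [5.0, 5.0, 6.0, 5.0],
    "run2_ctrl": [7.0, 7.0, 5.0, 5.0],
    "run2_trt":  [7.0, 7.0, 6.0, 5.0],
}

pc1 = first_pc_scores(expr)
print(pc1)
```

In this toy case the samples split by run on PC1 rather than by condition, which is the pattern that would tell you the batch variable needs to go in the design.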
Yeah, ideally this is a case where differences in total counts don't make much of a difference. If you have more than 30 M reads per sample, then you _should_ have sufficient coverage in every sample to avoid additional problems.
Ideally, everything would have been sequenced at the same depth in the same lane, but practically that's often not what happens. Those diagnostic plots should help identify if there's still a problem after including `Run` in the design matrix.