I have a large time-series data set with multiple conditions, for which I'm performing RNA-seq and differential gene expression (DGE) analysis with DESeq2. With 100+ samples, I wasn't able to process them all at the same time, so I have several batches.
When I add all of the variables to the DESeq2 design (~run + batch + strain + minute + strain:minute), I get the error that the model matrix is not full rank.
To solve the issue, I tried the solution suggested by the DESeq2 developer for nested conditions, etc.
It seems like whatever I do, my model matrix is not full rank. I will try to find a biostatistician to work with, but in the meantime, I was wondering if it's a good idea to use other software to do batch correction and then move to DESeq2? What is the consensus? Do folks have a specific tool they prefer when they do this?
Please post the colData (a meaningful subset of it), so one can have a look. Almost certainly there is a suboptimal encoding of your variables.
In general, putting samples of the same batch on different instrument runs will not cause technical artifacts. If that is what you mean by "run", you should be able to omit that from your design.
There is no clever trick that will help you if your batches are confounded with your experimental variables.
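To illustrate why confounding cannot be fixed by re-encoding, here is a minimal sketch in plain Python (not DESeq2 code). The sample table, the level names (`b1`, `b2`, `A`, `B`) and the helper functions `dummy_columns`, `design_matrix` and `matrix_rank` are all made up for illustration. It builds a treatment-coded design matrix for `~ batch + strain` and compares its rank to its number of columns; "not full rank" simply means the rank is smaller. In R, the analogous check would be to inspect the output of `model.matrix()` on your own colData.

```python
from fractions import Fraction


def dummy_columns(values):
    """Treatment-code a factor: one 0/1 column per level except the first."""
    levels = sorted(set(values))
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in levels[1:]}


def design_matrix(col_data, factors):
    """Build an intercept + main-effects design matrix (list of rows)."""
    n = len(next(iter(col_data.values())))
    names, cols = ["Intercept"], [[1] * n]
    for f in factors:
        for lvl, col in dummy_columns(col_data[f]).items():
            names.append(f"{f}_{lvl}")
            cols.append(col)
    return names, [list(row) for row in zip(*cols)]


def matrix_rank(rows):
    """Rank via exact Gauss-Jordan elimination on fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank, n_rows, n_cols = 0, len(m), len(m[0])
    for c in range(n_cols):
        pivot = next((r for r in range(rank, n_rows) if m[r][c] != 0), None)
        if pivot is None:
            continue  # column is a linear combination of earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(n_rows):
            if r != rank and m[r][c] != 0:
                factor = m[r][c] / m[rank][c]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank


# Hypothetical colData: each strain was processed in its own batch (confounded).
confounded = {"batch": ["b1", "b1", "b2", "b2"],
              "strain": ["A", "A", "B", "B"]}
# Hypothetical colData: every batch contains both strains (balanced).
balanced = {"batch": ["b1", "b1", "b2", "b2"],
            "strain": ["A", "B", "A", "B"]}

for label, col_data in [("confounded", confounded), ("balanced", balanced)]:
    names, rows = design_matrix(col_data, ["batch", "strain"])
    rank = matrix_rank(rows)
    status = "full rank" if rank == len(names) else "NOT full rank"
    print(f"{label}: columns={names}, rank={rank} -> {status}")
```

In the confounded layout the batch column and the strain column are identical, so no model can separate the two effects. In the balanced layout they differ, and both effects can be estimated.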