In some situations it is neither affordable nor feasible to have replicates for every condition in a large RNA-seq experiment, and the analysis is only meant to generate hypotheses for designing further experiments. In that case, is it possible to have duplicate samples for one or two of the groups and a single sample for the other groups? Could we estimate a constant dispersion from these replicates and apply it to all of the samples for differential gene expression analysis?
I'm using edgeR/limma for my differential gene expression analysis. When there are no replicates, the eBayes function gives me an error. However, if I add a replicate to just one of the conditions, it doesn't give any error. Does this mean that it uses the dispersion calculated from the duplicated condition for all of the other single-sample conditions? Or is it a bug that it doesn't give an error?
Should I consider this approach of using duplicates for some conditions, rather than no replicates at all, in large experiments that are done for hypothesis generation?
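To make the error/no-error behaviour concrete, here is a minimal sketch of the degrees-of-freedom arithmetic behind it. It is plain Python, not limma code, and it assumes a design with one coefficient per condition (something like `~0 + group`); the function name `residual_df` and the condition labels are made up for illustration. Under that assumption the residual degrees of freedom are the number of samples minus the number of conditions, and a variance can only be estimated when that number is above zero.

```python
from collections import Counter

def residual_df(groups):
    """Residual degrees of freedom for a one-coefficient-per-condition design.

    Assumption: one mean is fitted per condition, so the model uses as many
    coefficients as there are distinct condition labels.
    """
    n_samples = len(groups)
    n_conditions = len(Counter(groups))
    return n_samples - n_conditions

# Hypothetical layouts: four conditions, with and without one duplicate.
no_reps = ["A", "B", "C", "D"]
one_dup = ["A", "A", "B", "C", "D"]

print(residual_df(no_reps))  # 0: nothing left over to estimate variance from
print(residual_df(one_dup))  # 1: the single duplicated pair carries all of it
```

If this reading is right, the missing error with one duplicate would not be a bug: the model has one residual degree of freedom, so each gene's variance estimate rests entirely on that one pair of samples.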
Cross-posted here
What do you mean by duplicate? Do you mean making a pretend bio replicate when you really don't have one?
I mean having two biological replicates for one condition. Not exactly pretending; that would mean using something like bootstrapping, which is good for generating technical replicates. My intention is to get the average variability from the few duplicated samples and then assume that my other samples would show the same variability if they had been done in duplicate. In that case it is like having biological replicates for them too, and all the samples can be used for DEG analysis.
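As a rough illustration of what "borrowing" variability from the duplicated conditions means for a single gene, here is a sketch in plain Python. It is not how edgeR or limma estimate dispersion; it only shows a pooled within-group variance computed from the conditions that have replicates, which would then be assumed to hold for the single-sample conditions as well. The function name `pooled_variance` and the log-expression values are invented for the example.

```python
def pooled_variance(groups):
    """Pool within-condition variance over conditions with >= 2 samples.

    groups: dict mapping condition name -> list of log-expression values
            for one gene.
    Returns (pooled variance, degrees of freedom it is based on).
    """
    sum_sq = 0.0
    df = 0
    for values in groups.values():
        n = len(values)
        if n < 2:
            continue  # single-sample conditions carry no variance information
        mean = sum(values) / n
        sum_sq += sum((v - mean) ** 2 for v in values)
        df += n - 1
    if df == 0:
        raise ValueError("no replicated condition: variance cannot be estimated")
    return sum_sq / df, df

# Made-up log-expression values for one gene: A and B duplicated, C and D not.
gene = {"A": [5.0, 5.4], "B": [6.1, 5.9], "C": [7.2], "D": [4.8]}
var, df = pooled_variance(gene)
print(round(var, 3), df)  # 0.05 2
```

The point of the sketch is the small `df`: with only two duplicated conditions, the variance applied to every comparison is based on two degrees of freedom, and the conditions C and D contribute nothing to it.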