I've simulated RNA abundance with wgsim. The simulation itself was error free. There is a single factor in my experiment that looks like:
A1 A2 A3 B1 B2 B3
R1_101 113 113 113 13 11 9
R1_102 247 246 246 12 12 14
R1_103 20835 20915 20788 9973 9955 9973
A1, A2, A3 are the simulated replicates for the first level. B1, B2 and B3 are the simulated replicated for the second level.
As expected, the reads counts for each level are very close because it was an error-free simulation. The purpose of the experiment is to compare it with cuffdiffs (another differential package) in detecting log-fold changes.
Unfortunately, I run into an error in DESq2:
Error in estimateDispersionsFit(object, fitType = fitType, quiet = quiet) :
all gene-wise dispersion estimates are within 2 orders of magnitude
It looks like the package is unable to estimate a dispersion factor (most likely it's too small). However, I had no problem with cuffdiffs. Is there anything that I can do to make it work?
is there a specific reason why you simulated the replicates with so little variance? to me it doesn't seem realistic at all and that's why deseq2 is complaining. if you want to compare cuffdiff with deseq2, you should use data that is closer to real rna-seq. maybe you can use dispersion values from a sequencing experiment. this is for example done in compcodeR.
No particular reason. The purpose of my experiment is to compare log-fold changes. For example, if a gene has a log-fold ratio of 10. How would cuffdiff and DESeq2 react to it?
in my experience (I also performed a few such simulations), nothing beats the fold change estimates of deseq2. this is because of the moderation applied to lowly expressed genes. if you want to try it yourself, I'd suggest you to take a look at compcodeR. it's a very handy package for DEA comparisons.
I got this error when I forgot to normalize the size factors such that they are around 1.
Thank you for your reply.
I run the lines below but the result got no DEG, why? My data is real with three replicates.
Please open a new question, and read the DESeq2 manual. At no point it instructs to use the rlog as input for DESeq2, use raw counts, rlog is wrong and violates assumptions, in part because it is already on log scale. The entire code you use is custom, be sure to follow the manual precisely.