I performed differential expression analysis with DESeq, edgeR and cuffdiff on my data. Surprisingly, the results differ considerably between the three tools. Here's a Venn diagram of my results. I get 1001 DE genes with DESeq, 1447 DE genes with edgeR and "only" 149 DE genes with cuffdiff. Can anyone explain why cuffdiff is so stringent?
You should specify what versions of each tool you're using. The methods are sometimes updated between versions (especially Cufflinks, which has seen significant updates recently).
Cufflinks tries to estimate the abundance of different transcripts within a sample. Given biological replicates, edgeR and DESeq try to determine whether the observed difference between two conditions can be attributed to the experimental condition rather than to biological variance. So I don't understand what you are calling differentially expressed across edgeR, DESeq and Cufflinks. You should explain your setup, experimental conditions, biological replicates, and what you obtained using Cufflinks. Do you mean cuffdiff?
It's very important to know the question you are asking when conducting a statistical study. For example, cuffdiff tries to answer whether there is a difference in a transcript's expression between two samples. edgeR and DESeq try to answer whether the difference in total expression (the total count of a gene, including all its isoforms) that you see between samples is due to the condition alone and not to biological variability.
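To make the transcript-level versus gene-level distinction concrete, here is a minimal Python sketch. The gene names, isoform names and counts are all made up for illustration, and `gene_totals` is a hypothetical helper, not part of any of the tools discussed. It only shows how per-isoform counts collapse into the per-gene totals that a count-based test would see.

```python
from collections import defaultdict

# Hypothetical isoform-level counts: (gene, isoform) -> (control, treated).
# These numbers are invented purely to illustrate the point.
isoform_counts = {
    ("geneA", "A.1"): (100, 20),   # isoform A.1 goes down
    ("geneA", "A.2"): (20, 100),   # isoform A.2 goes up
    ("geneB", "B.1"): (50, 150),   # single-isoform gene goes up
}

def gene_totals(counts):
    """Sum isoform counts per gene, separately for each condition."""
    totals = defaultdict(lambda: [0, 0])
    for (gene, _isoform), (control, treated) in counts.items():
        totals[gene][0] += control
        totals[gene][1] += treated
    return {gene: tuple(pair) for gene, pair in totals.items()}

totals = gene_totals(isoform_counts)
for gene, (control, treated) in sorted(totals.items()):
    print(gene, control, treated)
```

In this toy case geneA switches isoforms while its total count stays the same, so a transcript-level question and a gene-level question would give different answers for it.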
Secondly, the differences I recall between edgeR and Cufflinks lie in the equation for the variance and in the estimation of the dispersion parameters. Given that you have biological replicates, I would expect more genes in common than you obtained. If you had run the test without replicates, it would be relatively difficult to say much, as the packages would have very little information from which to estimate the dispersion parameters. In the no-replicate case, DESeq tries to estimate this by borrowing from other genes with similar expression, if I am not mistaken.
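For reference, this is the negative binomial mean-variance relation that edgeR's model is built on. The symbols are my own notation, not taken from the posts above: \mu_g is the expected count for gene g and \phi_g is its dispersion.

\[
\operatorname{Var}(Y_g) = \mu_g + \phi_g\,\mu_g^{2}
\]

With \phi_g = 0 this reduces to the Poisson case, where the variance equals the mean. The extra term \phi_g \mu_g^{2} is the part that has to be estimated from replicates, which is why having few or no replicates makes the dispersion estimate so uncertain.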
I would recommend filling in some gaps with regard to your setup to obtain (more) meaningful and helpful answers.
I mean cuffdiff, of course. I have a group of three samples (control) and a group of eight samples (treated). I performed DE analysis using these three tools and compared the results.
Yeah, there are a lot of differences between Cufflinks versions! See this post: http://gettinggeneticsdone.blogspot.com/2012/04/rna-seq-methods-march-twitter-roundup.html
The latest version of each, I think: edgeR 2.6.7, DESeq 1.8.3, Cufflinks 2.0.0.
That is interesting. What is the overlap between DESeq and edgeR?
Intersection of DESeq and edgeR: 938 genes, so a very good overlap.
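To put numbers on that overlap, here is a small Python sketch using only the three figures reported in this thread (1001 DESeq genes, 1447 edgeR genes, 938 shared). The function name `overlap_stats` and its output keys are hypothetical, chosen just for this example.

```python
def overlap_stats(n_a, n_b, n_both):
    """Summarise the overlap of two gene lists from their sizes alone."""
    union = n_a + n_b - n_both
    return {
        "only_a": n_a - n_both,           # genes called by list A only
        "only_b": n_b - n_both,           # genes called by list B only
        "frac_a_shared": n_both / n_a,    # share of A also found in B
        "frac_b_shared": n_both / n_b,    # share of B also found in A
        "jaccard": n_both / union,        # intersection over union
    }

# A = DESeq (1001 genes), B = edgeR (1447 genes), 938 genes in common.
stats = overlap_stats(1001, 1447, 938)
for key, value in stats.items():
    print(key, round(value, 3))
```

By this count, about 94% of the DESeq calls are also made by edgeR, while about 65% of the edgeR calls are also made by DESeq. That leaves 63 DESeq-only genes and 509 edgeR-only genes.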