Hi,
I have the same BAM files from TopHat and have used both cuffdiff and HTSeq (followed by an R package) to get DE genes. The problem is that the number of genes is not the same in the HTSeq and cuffdiff outputs. I expected the counts for each gene to differ, but not the number of genes, since the reference was the same for both. I get more genes in the HTSeq output than in the cuffdiff output. Any idea about the reason would be appreciated.
Sorry, maybe I did not explain well. I am not talking about a difference in the number of DE genes, but about the difference in the total number of input genes for these methods. I have 23368 genes after running HTSeq (that is, before running DESeq to get DE genes), but 23284 genes in total in the gene_exp.diff file from cuffdiff, regardless of how many of them are DE. Since I used the same BAM file and the same GTF file for both, I expected the same total number of genes, although of course the number of DE genes should differ because the algorithms differ.
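One way to see where the extra genes come from is to compare the gene IDs in the two outputs directly. The snippet below is only a rough sketch under some assumptions: the htseq-count output is a two-column tab-separated table whose trailing summary lines start with "__" (older versions may name them differently), the cuffdiff file has a header row with a "gene_id" column, and both tools report the same kind of ID from the GTF. The function names are made up for illustration.

```python
import csv
import io


def read_htseq_ids(handle):
    """Collect feature IDs from an htseq-count style table (ID<TAB>count)."""
    ids = set()
    for line in handle:
        line = line.strip()
        if not line:
            continue
        feature_id = line.split("\t")[0]
        # Assumption: summary counters such as __no_feature start with "__"
        # and are not genes, so they are skipped.
        if feature_id.startswith("__"):
            continue
        ids.add(feature_id)
    return ids


def read_cuffdiff_ids(handle, column="gene_id"):
    """Collect unique gene IDs from a tab-separated table with a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    # A set removes duplicates when a gene appears in several comparisons.
    return {row[column] for row in reader}


def compare_gene_sets(htseq_ids, cuffdiff_ids):
    """Report which IDs are unique to each output and how many are shared."""
    return {
        "only_htseq": sorted(htseq_ids - cuffdiff_ids),
        "only_cuffdiff": sorted(cuffdiff_ids - htseq_ids),
        "shared": len(htseq_ids & cuffdiff_ids),
    }


if __name__ == "__main__":
    # Toy data standing in for the real files; the gene names are invented.
    htseq_demo = io.StringIO("geneA\t10\ngeneB\t0\ngeneC\t5\n__no_feature\t100\n")
    cuff_demo = io.StringIO(
        "test_id\tgene_id\tgene\ngeneA\tgeneA\tA\ngeneC\tgeneC\tC\n"
    )
    result = compare_gene_sets(
        read_htseq_ids(htseq_demo), read_cuffdiff_ids(cuff_demo)
    )
    print(result)
```

With real data, the two `io.StringIO` objects would be replaced by open file handles for the HTSeq count table and gene_exp.diff. Looking at the IDs listed under "only_htseq" should show what the missing genes have in common.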
Well that does not actually change what I said.
It does not matter if, at any given point in the analysis, you have fewer or more inputs when following one method over another. The only thing that matters is the end result. If by the end of the analyses one method gives you a radically different answer than the other, then there is reason to worry. Right now it is premature to fret over a difference of 84 genes out of more than 23000.
A robust observation should be reproducible by different analysis methods.
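To put a number on that, the final DE gene lists from the two methods can be compared once both analyses are done. This is just an illustrative sketch: the function name and the gene IDs are invented, and the Jaccard index is one possible overlap measure among several.

```python
def de_overlap(de_a, de_b):
    """Return the shared DE genes and the Jaccard index of two gene lists."""
    a, b = set(de_a), set(de_b)
    shared = a & b
    union = a | b
    # Jaccard index: shared genes divided by all genes called by either method.
    # Defined here as 0.0 when both lists are empty.
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


if __name__ == "__main__":
    # Hypothetical DE calls from the two pipelines.
    deseq_calls = ["g1", "g2", "g3"]
    cuffdiff_calls = ["g2", "g3", "g4"]
    shared, jaccard = de_overlap(deseq_calls, cuffdiff_calls)
    print(sorted(shared), jaccard)
```

Genes in the shared set are the ones both methods agree on, which makes them the most defensible candidates for follow-up.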
Yes, now I understand, thank you.