Hi everyone,
I guess this is more of a general query about interpretation of cuffdiff output, since (thanks to some help from biostars) I'm pretty happy that my analysis is running correctly. I hope someone will be happy to help (again) as I'm an RNA-seq newbie.
I have run cuffdiff on a set of RNA-seq data: 2 conditions with 30 replicate samples each (Drosophila). When I look at the splicing.diff output file, approx. 80-90% of the gene IDs have the status 'notest'. I consulted the manual here http://cole-trapnell-lab.github.io/cufflinks/cuffdiff/#differential-expression-tests, which states that this means 'not enough alignments for testing'. Now I could be naive here, but this surprised me given the number of replicates I have, and the RNA data is good quality from my previous checks. I've seen an example splicing.diff output from a colleague's previous analysis, and they have a similarly low proportion of successful tests. So is this a pretty normal output, or is there a problem I'm missing that could explain this?
Thanks in advance for your help!
This would seem to be pretty normal, since most splicing events don't occur in most cells/conditions/developmental stages (not to mention that most genes aren't expressed in most conditions). Have a look at a few of these positions in a few BAM files in IGV or another browser. This should confirm whether this is the case or not.
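If you want a number to go with that impression before opening a browser, you can tally the status column of splicing.diff. The snippet below is only a rough sketch: it assumes the file is tab-delimited with a header row containing a column named `status`, and the function names `tally_status` and `fraction` are made up for illustration, not part of cufflinks.

```python
import csv
from collections import Counter


def tally_status(lines, column="status"):
    """Count each value in one column of a tab-delimited table with a header row.

    `lines` can be an open file handle or any iterable of text lines.
    The column name "status" is an assumption about the splicing.diff header.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return Counter(row[column] for row in reader)


def fraction(counts, status):
    """Return the share of rows carrying the given status (0.0 for an empty table)."""
    total = sum(counts.values())
    return counts[status] / total if total else 0.0


# Hypothetical usage, assuming splicing.diff is in the working directory:
#
#     with open("splicing.diff", newline="") as handle:
#         counts = tally_status(handle)
#     for status, n in counts.most_common():
#         print(status, n, f"{fraction(counts, status):.1%}")
```

Picking a handful of the NOTEST and OK gene IDs from that table then gives you a concrete list of loci to inspect in IGV.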
Thanks - this makes sense, good to know the output isn't too weird, and after having a look at some of the data it seems to make sense.