In RNA sequencing, there are many analyses we can perform. For example, a standard pipeline involves:
- Alignment of the reads (STAR, RUM, TopHat, MapSplice, GSNAP etc)
- Gene Counting (HTSeq, Cufflinks etc)
- Differential gene expression analysis (DESeq, edgeR, Cuffdiff etc)
- Functional Annotation (DAVID)
- Pathway Analysis (SPIA, GSEA, WGCNA etc)
This set of analyses should give us a reasonable biological picture of what is happening in our sample. However, in our recent attempts to analyse RNA sequencing data, we found it very difficult to interpret differential exon usage results, or more generally, differential isoform expression. A possible pipeline would be something like
Alignment -> Exon Counting -> Differential exon usage analysis (DEXSeq)
or
Alignment -> MATS (or other tools)
Alternatively, we can try de novo assembly with Trinity or Oases to construct the transcriptome, or just align to the known transcriptome of our organism, then pass the results to RSEM and perform an EBSeq analysis.
But what comes afterwards? After obtaining differentially expressed genes, we can try different forms of functional annotation, but it is much more difficult to perform functional annotation on individual isoforms or exons. When you have more than 1000 differentially expressed exons, how do you know exactly what is wrong in your condition, or what is happening?
tldr: What is your usual practice for explaining the results of a differential exon usage / differential isoform expression analysis, especially when you don't have a defined set of candidate genes?
I'm rather interested in what others have to say about this question. My 2 cents is that we aren't generally in a position to use these results (differential exon/isoform usage specifically) to give any biological insight since, frankly, we often have no clue what the functional consequences are of increasing the expression of some particular splice form (that itself has never been studied) of some gene that's only mentioned in passing in one paper from a decade ago.
I do agree with that. Although we have realized that this is an important question, we just don't have enough knowledge (e.g. databases) to answer it. That said, we are not completely blind to this information. For example, the entry for the gene Ptk2, http://www.uniprot.org/uniprot/Q05397, does provide information about its different isoforms. The problem is that when you have many of these differential isoforms, it becomes difficult to do a manual search for each of them...
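One way to make the manual search a bit more tractable might be to collapse the exon-level hits to genes first and look up the most affected genes before the rest. Below is a minimal Python sketch of that idea, not an established method. The function name `rank_genes_by_exon_hits`, the column names `gene_id`, `exon_id` and `padj`, the 0.05 cutoff and the toy rows are all assumptions for illustration; they are not the output format of DEXSeq or any other specific tool, and the values are made up.

```python
from collections import defaultdict


def rank_genes_by_exon_hits(rows, padj_cutoff=0.05):
    """Collapse exon-level results to genes and rank the genes.

    `rows` is an iterable of dicts with the (assumed) keys
    'gene_id', 'exon_id' and 'padj'. Exons with a missing padj
    or a padj at or above the cutoff are ignored.
    Returns a list of (gene_id, n_significant_exons, best_padj),
    sorted by most significant exons first, then by smallest padj.
    """
    hits = defaultdict(list)
    for row in rows:
        padj = row["padj"]
        if padj is not None and padj < padj_cutoff:
            hits[row["gene_id"]].append((row["exon_id"], padj))

    ranked = [
        (gene, len(exons), min(p for _, p in exons))
        for gene, exons in hits.items()
    ]
    # Sort: more significant exons first, then smaller padj, then gene name.
    ranked.sort(key=lambda item: (-item[1], item[2], item[0]))
    return ranked


# Toy, made-up rows purely to show the call; not real results.
example_rows = [
    {"gene_id": "Ptk2", "exon_id": "E003", "padj": 0.001},
    {"gene_id": "Ptk2", "exon_id": "E007", "padj": 0.03},
    {"gene_id": "GeneB", "exon_id": "E001", "padj": 0.2},
    {"gene_id": "GeneC", "exon_id": "E002", "padj": 0.004},
    {"gene_id": "GeneC", "exon_id": "E005", "padj": None},
]

ranked = rank_genes_by_exon_hits(example_rows)
for gene, n_exons, best_padj in ranked:
    print(gene, n_exons, best_padj)
```

This only prioritises which genes to look up first; it says nothing about what the isoform change actually does, which is still the hard part.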
Along these lines, it'd be nice to see any papers people have come across that do a good job of these types of analyses and their interpretation.