Hi,
From TopHat 2 alignment, I have obtained outputs containing splice junctions and the corresponding read counts. I am interested in calculating differential splice junction usage between wild type and mutant samples.
In a recent publication (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4358997/), the DEXSeq function called testForDEURT was used to quantify differential splice junction usage between two samples. However, I am not able to figure out the important steps needed to use this function.
I have wild type and mutant samples in replicates, so some statistics, for example p-values and FDR, will be needed to define significant candidates.
Could somebody please guide me through this analysis? Any response is highly appreciated.
The only steps that actually differ from the normal DEXSeq workflow are (1) filtering the splice site counts (the filter they used is described in the methods) and (2) annotating splice sites with gene information so that DEXSeq can parse them. You'll have to write something to do (2); (1) can likely be done with awk.
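As a rough illustration of steps (1) and (2), here is a minimal Python sketch rather than the paper's actual procedure. It assumes the input is TopHat's junctions.bed (BED12, read count in the score column, intron bounds derived from the two blocks), that genes are supplied as 0-based half-open intervals, and that a junction belongs to a gene only if it lies entirely inside exactly one gene on the same strand. The names `parse_junction`, `assign_gene`, `junction_counts` and the `min_count=5` threshold are hypothetical; substitute the filter given in the paper's methods.

```python
from typing import NamedTuple, Optional


class Junction(NamedTuple):
    chrom: str
    start: int   # 0-based intron start
    end: int     # 0-based, exclusive intron end
    strand: str
    count: int   # reads spanning the junction


def parse_junction(line: str) -> Junction:
    """Parse one BED12 line, assuming TopHat's junctions.bed layout."""
    f = line.rstrip("\n").split("\t")
    chrom_start = int(f[1])
    block_sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    block_starts = [int(x) for x in f[11].rstrip(",").split(",")]
    # The intron runs from the end of the first block to the start of the second.
    return Junction(f[0], chrom_start + block_sizes[0],
                    chrom_start + block_starts[1], f[5], int(f[4]))


def assign_gene(j: Junction, genes) -> Optional[str]:
    """Return the single gene containing the junction, else None."""
    hits = [gid for gid, chrom, start, end, strand in genes
            if chrom == j.chrom and strand == j.strand
            and start <= j.start and j.end <= end]
    return hits[0] if len(hits) == 1 else None


def junction_counts(lines, genes, min_count=5):
    """Filter junctions by read count (hypothetical threshold) and annotate them."""
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("track"):
            continue  # skip blank lines and the BED track header
        j = parse_junction(line)
        if j.count < min_count:
            continue
        gene = assign_gene(j, genes)
        if gene is not None:
            kept.append((gene, j))
    return kept
```

Junctions that fall in no gene or in several overlapping genes are simply dropped here; whether that is appropriate depends on your annotation. A filter applied across replicates would need the counts from all samples merged first.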
To me, the implementation of DEXSeq for calculating differential splice junction usage is not straightforward. Could you please explain it in more detail? I managed to prepare my read count file (for splice junctions) in the DEXSeq format, but I cannot figure out how the corresponding GTF file should be prepared.
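One possible way to build the annotation, sketched in Python under stated assumptions: each junction is written as an "exonic_part" line in the flattened GFF layout that DEXSeq's dexseq_prepare_annotation.py produces, so that the IDs in the count files ("gene_id:part_number") match the annotation. Whether the paper did it this way is not confirmed here. The function name `flattened_gff`, the source label "junctions", and the use of the gene ID as a placeholder in the `transcripts` attribute are all hypothetical choices; coordinates are assumed to be 1-based and inclusive, as in GFF.

```python
def flattened_gff(junctions_by_gene, source="junctions"):
    """Build DEXSeq-style flattened GFF lines with junctions as exonic parts.

    junctions_by_gene maps gene_id -> list of (chrom, start, end, strand),
    with 1-based inclusive coordinates. Returns (gff_lines, feature_ids),
    where feature_ids are the "gene_id:NNN" names to use in the count files.
    """
    gff_lines, feature_ids = [], []
    for gene_id in sorted(junctions_by_gene):
        juncs = sorted(junctions_by_gene[gene_id], key=lambda j: (j[1], j[2]))
        chrom, strand = juncs[0][0], juncs[0][3]
        gene_start = min(j[1] for j in juncs)
        gene_end = max(j[2] for j in juncs)
        # One aggregate_gene line per gene, spanning all of its junctions.
        gff_lines.append("\t".join([
            chrom, source, "aggregate_gene", str(gene_start), str(gene_end),
            ".", strand, ".", f'gene_id "{gene_id}"']))
        for n, (j_chrom, j_start, j_end, j_strand) in enumerate(juncs, 1):
            part = f"{n:03d}"
            # Placeholder transcripts value (assumption): junctions carry
            # no transcript assignment in this sketch.
            attrs = (f'transcripts "{gene_id}"; '
                     f'exonic_part_number "{part}"; gene_id "{gene_id}"')
            gff_lines.append("\t".join([
                j_chrom, source, "exonic_part", str(j_start), str(j_end),
                ".", j_strand, ".", attrs]))
            feature_ids.append(f"{gene_id}:{part}")
    return gff_lines, feature_ids
```

The first column of each per-sample count file would then use exactly the returned feature IDs, in the same set for every sample. Note that junctions within a gene can overlap each other, unlike the exonic parts DEXSeq normally sees, so it is worth checking that the resulting object loads and that the fitted results look sensible before trusting the p-values.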