I've downloaded different bam files (RNA-seq) that corresponds to tumor. I have the coordinates for exons of interest, and I would like to find the differential exon usage over those bam files.
I would like to know the best way to do that. Is DEXSeq the best option?
If you are looking at differential exon usage (for patient characteristics, tumor-normal, etc.), DEXSeq is a popular option.
I happen to like JunctionSeq (which is based upon DEXSeq), which also includes splice sites (known junctions, and novel junctions between known exons) and an overall p-value for the gene. Sometimes, I could define known differences with JunctionSeq but not DEXSeq.
In both cases, you'll need replicates. If you only had one sample you could try a de novo assembly like cufflinks, but I've found having a merged assembly with multiple samples important for getting robust results (although having split bam files for a high-coverage sample might also be OK).
Let's say that I have bam files for 3 different treatments and I would like to know the exon usage over a defined exon regions (Let's say exon.bed file). As I downloaded the bam files for a web server, I don't have replicates for each bam file (but apparently, the number of reads is high in each one). As DEXSeq requires as input a count matrix on the exon level, can I create it from my exon.bed file?.Because I don't know exactly how to create it... Otherwise, as I really want to know is the numer of exons included in the bam files, could I run:
DEXSeq uses a specialized scripts (dexseq_prepare_annotation.py and dexseq_count.py) to run HTSeq for exon counts. The manual mentions that you can use other read counting programs, but I would typically use those specifically designed for DEXSeq (using a GTF annotation file).
JunctionSeq uses QoRTs for exon and junction counts.
I would recommend IsoformSwitchAnalyzeR ( http://bioconductor.org/packages/IsoformSwitchAnalyzeR/ ) which enables statistical identification of isoform switches with predicted functional consequences where the consequences can be chosen from a long list but includes gain/loss of protein domains, signal peptides changes in NMD sensitivity etc.
It requires that you run RSEM or Salmon (probably the better option and much faster) to quantify transcript/isoform expression first - but IsoformSwitchAnalyzeR directly supports both RSEM and Salmon outout.
Let's say that I have bam files for 3 different treatments and I would like to know the exon usage over a defined exon regions (Let's say exon.bed file). As I downloaded the bam files for a web server, I don't have replicates for each bam file (but apparently, the number of reads is high in each one). As DEXSeq requires as input a count matrix on the exon level, can I create it from my exon.bed file?.Because I don't know exactly how to create it... Otherwise, as I really want to know is the numer of exons included in the bam files, could I run:
Many thanks in advance!
Thank you!
DEXSeq uses a specialized scripts (dexseq_prepare_annotation.py and dexseq_count.py) to run HTSeq for exon counts. The manual mentions that you can use other read counting programs, but I would typically use those specifically designed for DEXSeq (using a GTF annotation file).
JunctionSeq uses QoRTs for exon and junction counts.