7.3 years ago
ren.yingxue
Hi, I want to use DEXSeq to identify differential exon usage between ~800 cancer and 90 normal samples from TCGA. I have a matrix of raw exon counts downloaded from TCGA in the format below:
Exon TCGA-66-2742-01A TCGA-L4-A4E5-01A TCGA-86-8671-01A TCGA-77-8008-01A
chr1:11874-12227:+ 2 7 15 1
chr1:12595-12721:+ 1 1 3 0
chr1:12613-12721:+ 1 1 3 0
The problem is that DEXSeq expects you to run dexseq_count.py on each sample's BAM/SAM file to count exons, and then to combine the resulting files into a "countFiles" object, which is passed to the DEXSeq function "DEXSeqDataSetFromHTSeq". Since I only have a matrix of raw exon counts and don't have the individual BAM/SAM files, I wonder how I can still use DEXSeq. Any suggestions are appreciated! Thank you!
Why don't you use dexseq_count.py then?
Thank you for the comment. I cannot use dexseq_count.py because I don't have the BAM files for the ~890 samples. I only have a matrix of raw exon counts downloaded from TCGA.
To identify differential exon usage, you need to know which is the parent gene name for different exons. If you have that information, you can convert your matrix to the DEXSeq count format.
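A rough sketch of that conversion in Python, in case it helps. It assumes you already have an exon-to-gene lookup (the `exon_to_gene` dict and the gene name `GENE_A` below are made-up placeholders, not real annotation) and it writes nothing to disk; it only builds, per sample, the two-column "gene:exon_number, count" lines that dexseq_count.py output is usually described as having. Exons are numbered by start coordinate within each gene. Note that TCGA exons can overlap (as the second and third rows of the example do), whereas DEXSeq normally works on flattened, non-overlapping counting bins, so treat this as a starting point rather than a drop-in replacement.

```python
from collections import defaultdict


def parse_exon(exon_id):
    """Split an ID like 'chr1:11874-12227:+' into (chrom, start, end, strand)."""
    chrom, span, strand = exon_id.split(":")
    start, end = (int(x) for x in span.split("-"))
    return chrom, start, end, strand


def matrix_to_dexseq_lines(matrix_lines, exon_to_gene):
    """Return {sample: ['GENE:001<TAB>count', ...]} from a whitespace-separated matrix.

    exon_to_gene is a user-supplied mapping (hypothetical here); exons that are
    missing from it are skipped.
    """
    samples = matrix_lines[0].split()[1:]
    by_gene = defaultdict(list)
    for line in matrix_lines[1:]:
        fields = line.split()
        exon_id, counts = fields[0], [int(c) for c in fields[1:]]
        gene = exon_to_gene.get(exon_id)
        if gene is None:
            continue  # no parent gene known for this exon
        _, start, end, _ = parse_exon(exon_id)
        by_gene[gene].append((start, end, counts))

    out = {sample: [] for sample in samples}
    for gene in sorted(by_gene):
        # Number exons 001, 002, ... in order of genomic start (then end).
        for n, (_, _, counts) in enumerate(sorted(by_gene[gene]), start=1):
            for sample, count in zip(samples, counts):
                out[sample].append(f"{gene}:{n:03d}\t{count}")
    return out


# The rows from the question; the gene assignment is a placeholder.
matrix_lines = [
    "Exon TCGA-66-2742-01A TCGA-L4-A4E5-01A TCGA-86-8671-01A TCGA-77-8008-01A",
    "chr1:11874-12227:+ 2 7 15 1",
    "chr1:12595-12721:+ 1 1 3 0",
    "chr1:12613-12721:+ 1 1 3 0",
]
exon_to_gene = {
    "chr1:11874-12227:+": "GENE_A",
    "chr1:12595-12721:+": "GENE_A",
    "chr1:12613-12721:+": "GENE_A",
}
per_sample = matrix_to_dexseq_lines(matrix_lines, exon_to_gene)
```

Each list in `per_sample` could then be written to one text file per sample. If I remember correctly, DEXSeq's `DEXSeqDataSet()` constructor can also take a count matrix directly together with `featureID` and `groupID` vectors, which would avoid the intermediate files altogether; please check the package documentation before relying on that.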
Did you ever figure out how to do it? I'm trying to do the same thing.