Question

Understanding DEXSeq results

3.2 years ago

Assa Yeroslaviz ★ 1.9k

I am trying to understand the DEXSeq output file. For example, this is the first line of results in my file:

    groupID featureID   exonBaseMean    dispersion  stat    pvalue  padj    input   IP  log2fold_IP_input   genomicData.seqnames    genomicData.start   genomicData.end genomicData.width   genomicData.strand  countData.1 countData.2 countData.3 countData.4 countData.5 countData.6 transcripts
    WBGene00006062:E003 WBGene00006062  E003    20.97392794 0.010368532 166.1550712 5.12E-38    3.90E-34    8.567627289 0.025351334 -17.52140343    I   9503439 9504354 916 +   81  61  96  0   0   0   c("F30A10.8.2", "F30A10.8.1")

I was wondering why there is such a big difference between the input/IP values and the count values of the samples. If I understood correctly, these are the normalized values after dispersion estimation. But how are they calculated? Are they log-transformed? I have tried to read the original paper, but didn't really get it.
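To make the scale question concrete, here is a rough sanity check in Python that uses only the numbers from the row above. It tests three simple guesses for how `log2fold_IP_input` might relate to the `input` and `IP` columns. The grouping of samples 1-3 as input and 4-6 as IP is an assumption read off the values, not something the output states, and the three candidate formulas are guesses, not DEXSeq's documented method.

```python
import math

# Values copied from the example row (WBGene00006062:E003).
coef_input = 8.567627289
coef_ip = 0.025351334
reported_log2fc = -17.52140343
counts = [81, 61, 96, 0, 0, 0]

# ASSUMPTION: countData.1-3 are the input samples, countData.4-6 the IP samples.
mean_input = sum(counts[:3]) / 3
mean_ip = sum(counts[3:]) / 3

# Guess 1: the columns are on a linear scale, so take log2 of their ratio.
ratio_log2 = math.log2(coef_ip / coef_input)
# Guess 2: the columns are already log2 values, so subtract them.
diff_log2 = coef_ip - coef_input
# Guess 3: the columns are natural-log values, so subtract and convert to log2.
diff_ln = (coef_ip - coef_input) / math.log(2)

candidates = {
    "log2(IP / input)": ratio_log2,
    "IP - input": diff_log2,
    "(IP - input) / ln(2)": diff_ln,
}

for label, value in candidates.items():
    print(f"{label}: {value:.2f} (reported: {reported_log2fc:.2f})")
print(f"mean raw count, input: {mean_input:.2f}, IP: {mean_ip:.2f}")
```

None of these three guesses reproduces the reported fold change, which is why I suspect the `input` and `IP` columns are on some other transformed scale than the raw counts.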

Also, looking at the results, the overlapping elements are mostly non-coding RNAs. Would it make sense to remove them before the analysis, i.e. to remove the non-coding genes from the GTF file before mapping? Or does it make no difference to the normalization and differential expression results?

Thanks

dexseq normalization counts
