I'm analyzing Illumina RNA-Seq data from three FADS1 knockout mice and three wild types. After running cuffdiff on the six samples, I see in genes.read_group_tracking that:
$ egrep "tracking_id|Fads1" genes.read_group_tracking | column -t | less -S
tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status
Fads1 null 0 1 0.880439 0.880439 0.0142817 - OK
Fads1 null 1 1 1.09692 1.09692 0.017661 - OK
Fads1 null 2 0 0 0 0 - OK
Fads1 wt 0 347 331.474 331.474 5.38128 - OK
Fads1 wt 1 251 246.101 246.101 3.96235 - OK
Fads1 wt 2 156 167.516 167.516 2.69709 - OK
yet in genes.fpkm_tracking and gene_exp.diff I see:
$ egrep "tracking|Fads1" genes.fpkm_tracking | cut -f 1,10-20 | column -t | less -S
tracking_id null_FPKM null_conf_lo null_conf_hi null_status wt_FPKM wt_conf_lo wt_conf_hi wt_status
Fads1 0.0106476 0 0.0303051 OK 4.01357 2.80322 5.21247 OK
$ egrep "q_value|Fads1" gene_exp.diff | cut -f 1,5-15 | column -t | less -S
test_id sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
Fads1 null wt OK 0.0106476 4.01357 8.55822 4.1466 0.2495 0.999688 no
Where it seems like the confidence intervals are clearly separated, yet the P-value is much higher than my intuition would expect. Any ideas of things to check?
For completeness the cuffdiff command was: cuffdiff -p 12 -o ./analysis/FADS1/cuffdiff -b /mnt/data1/refs/igenomes/Mus_musculus/UCSC/mm10/Sequence/WholeGenomeFasta/genome.fa -L null,wt /mnt/data1/refs/igenomes/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf <bams,from,null> <bams,from,wt>
I noticed a similar problem....did you solve this yet? Thanks!
We used edgeR and it seemed to detect the gene as DE without any problems. I don't have the p-value off-hand though.