Biostars,
I am performing differential gene expression analysis between "control" and "treated" samples that differ 2-3 fold in sequencing depth (control samples have one half to one third as many reads as treated samples). If I perform DE analysis using the old Tuxedo protocol, I do not observe many differentially expressed genes, not even those that were used for sample validation before the samples were submitted for sequencing.
If I load Bigwig files (relatively better normalized) for these samples onto the genome browser, I can see the expected difference in reads on the genes of interest. In order to normalize the samples for sequencing depth, I am trying samtools view -s to subset the .bam files to similar sizes. But these subset files aren't compatible with Cufflinks since they lack the EOF marker.
I am wondering if such normalization is a good idea and, if yes, how to get around this problem of incompatibility with Cufflinks.
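For readers who hit the same error, here is a minimal Python sketch of the two pieces involved. It assumes that the -s argument of samtools view takes the form SEED.FRACTION (integer part is the random seed, decimal part is the fraction of reads to keep) and that a complete BAM file ends with the 28-byte empty BGZF block defined in the SAM/BAM specification. The function names, the seed value, and the read counts are hypothetical and only for illustration. A missing EOF marker often means the output is truncated or is not BAM at all (for example, SAM text written to a file named .bam because no BAM output option was given), so checking it is a reasonable first step.

```python
import os

# 28-byte empty BGZF block that terminates a complete BAM file
# (byte values as listed in the SAM/BAM format specification).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def subsample_arg(n_reads, target_reads, seed=42):
    """Build a SEED.FRACTION string for `samtools view -s`.

    Returns None when the file already has no more reads than the
    target, i.e. no subsampling is needed.
    """
    if target_reads >= n_reads:
        return None
    frac = target_reads / n_reads
    # Keep four decimal digits and stay strictly between 0 and 1.
    digits = max(1, min(round(frac * 10000), 9999))
    return f"{seed}.{digits:04d}"


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF EOF marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF


if __name__ == "__main__":
    # Hypothetical read counts: bring a 30M-read sample down to 10M.
    arg = subsample_arg(30_000_000, 10_000_000)
    print(f"samtools view -b -s {arg} -o treated.sub.bam treated.bam")
```

If has_bgzf_eof returns False for a subset file, it is worth confirming that the subsetting command actually wrote BAM (for instance with -b and -o) and ran to completion before blaming Cufflinks.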
Thanks a lot for your help in advance!
I would expect that htseq-count/featureCounts followed by DESeq2/edgeR/limma-voom would be able to deal with this difference in depth, but that's not what you asked for.
The old Tuxedo pipeline isn't considered "the best tool in the shed" anymore.
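To illustrate why count-based tools can cope with a 2-3 fold depth difference without subsampling, here is a simplified Python sketch of the median-of-ratios idea behind DESeq2-style size factors. It is not the DESeq2 implementation, and the sample names and toy counts are made up: the "treated" sample is sequenced three times deeper, and only the last gene changes beyond that.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate per-sample size factors by the median-of-ratios idea.

    counts maps a sample name to a list of per-gene read counts,
    with genes in the same order for every sample.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Log geometric mean of each gene across samples; genes with a
    # zero count in any sample are skipped (marked None).
    log_geo_means = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    # Size factor = exp(median of log ratios to the geometric mean).
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - log_geo_means[g]
            for g in range(n_genes)
            if log_geo_means[g] is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


if __name__ == "__main__":
    toy = {
        "control": [10, 20, 30, 40],
        "treated": [30, 60, 90, 480],  # 3x deeper; last gene truly up
    }
    sf = size_factors(toy)
    for name, values in toy.items():
        print(name, [round(v / sf[name], 1) for v in values])
```

After dividing by the size factors, the first three genes line up between the two samples and only the last one stands out, which is the behaviour the depth normalization is meant to give. None of this addresses the separate problem of having no replicates.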
Okay. To make things worse, there is only one sample per group (no replicates). limma-voom cannot calculate Common Dispersion and hence Tag Dispersion for this reason. There might be a way out, but what's the best option of the three? We can test the differentially expressed genes that come out of the analysis, but biological replicates for RNA-Seq are not possible for now.
Thanks for the help!
If you have unreplicated data then all of the presented options are equally crappy. GPower is supposed to be slightly better, but honestly you'd be better off not wasting your time on this dataset.
Okay. Thanks a lot for all the help.