Differential expression analysis with RNA-Seq samples that vary in depth
7.9 years ago
Satyajeet Khare ★ 1.6k

Biostars,

I am performing differential gene expression analysis between "control" and "treated" samples that differ 2-3 fold in sequencing depth (the control samples have one half to one third as many reads as the treated samples). If I perform DE analysis using the old Tuxedo protocol, I do not observe many differentially expressed genes, not even those that were used to validate the samples before they were submitted for sequencing.

If I load bigWig files (relatively better normalized) for these samples onto the genome browser, I can see the expected difference in reads on the genes of interest. To normalize the samples for sequencing depth, I am trying samtools view -s to subset the .bam files to similar sizes. But these subset files are not compatible with Cufflinks because they lack the EOF marker.

I am wondering if such normalization is a good idea and if yes, how to get around this problem of incompatibility with Cufflinks.

Thanks a lot for your help in advance!

RNA-Seq Hisat2 Cufflinks Depth of sequencing • 2.4k views

I would expect that htseq-count/featureCounts followed by DESeq2/edgeR/limma-voom would be able to deal with this difference in depth, but that's not what you asked for.

The old tuxedo pipeline isn't considered "the best tool in the shed" anymore.
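As a rough illustration of why count-based tools cope with a 2-3 fold difference in depth: they work on counts relative to library size rather than on raw counts. The sketch below computes log2 counts-per-million in the style of the voom transform. The toy counts and the function name `log_cpm` are made up for illustration, and this is not the limma code itself.

```python
import numpy as np

def log_cpm(counts, prior=0.5):
    """log2 counts-per-million per sample (columns = samples).

    Follows the form of the voom transform:
    log2((count + 0.5) / (library size + 1) * 1e6).
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=0)  # total reads counted per sample
    return np.log2((counts + prior) / (lib_size + 1.0) * 1e6)

# Hypothetical toy data: rows = genes, columns = (control, treated).
control = np.array([100, 200, 300, 400])
treated = control * 3  # same composition, three times the depth
counts = np.column_stack([control, treated])

lcpm = log_cpm(counts)
diff = lcpm[:, 1] - lcpm[:, 0]  # per-gene log2 difference after scaling
print(diff)
```

The raw counts differ 3-fold for every gene, but the per-gene differences in log-CPM are close to zero, so depth alone does not produce apparent differential expression.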


Okay. To make things worse, there is only one sample per group (no replicates). limma-voom cannot calculate the common dispersion, and hence the tagwise dispersion, for this reason. There might be a way out, but what's the best option of the three?

We can validate the differentially expressed genes that come out of the analysis, but biological replicates for RNA-Seq are not possible for now.

Thanks for the help!


If you have unreplicated data then all of the presented options are equally crappy. GPower is supposed to be slightly better, but honestly you'd be better off not wasting your time on this dataset.


Okay. Thanks a lot for all the help.

7.9 years ago

cuffdiff does an appropriate normalization (the same one as DESeq2, if I recall correctly) internally, so please don't subsample. Having said that, as WouterDeCoster wrote, you're strongly encouraged to not use cufflinks/cuffdiff, but rather one of the standard R-based tools.
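For readers unfamiliar with that normalization, here is a minimal sketch of the median-of-ratios idea behind DESeq-style size factors. It is an illustration on made-up counts, not the actual DESeq2 or cuffdiff code, and the function name `size_factors` is hypothetical.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors (rows = genes, columns = samples)."""
    counts = np.asarray(counts, dtype=float)
    # Genes with a zero in any sample have no finite log geometric mean; skip them.
    expressed = np.all(counts > 0, axis=1)
    log_counts = np.log(counts[expressed])
    # Per-gene log geometric mean across samples acts as a pseudo-reference.
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Each sample's factor is the median ratio of its counts to the reference.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

# Hypothetical toy counts: columns = (control, treated), treated sequenced 3x deeper.
counts = np.array([
    [100, 300],
    [200, 600],
    [50, 150],
    [0, 10],      # zero in one sample, ignored when estimating the factors
    [400, 2400],  # 6x raw difference, i.e. 2x beyond the depth effect
])

sf = size_factors(counts)
normalized = counts / sf
print(sf)
print(normalized)
```

Because the median is taken over genes, the one gene that truly changes does not distort the factors: after dividing by them, the unchanged genes line up across samples while the changed gene keeps its 2-fold difference. This is why subsampling the BAM files is unnecessary.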


Thank you! How about the new Tuxedo pipeline? Ballgown seems to rely on countMatrix. For small sample sizes (n < 4 per group), Ballgown recommends regularization using limma anyway.

Best


I've never used it, but given who wrote Ballgown it should be much better.
