Hi all,
I'm hoping some of the RNA-seq experts here have advice on normalization and on how to proceed with the following data analysis:
The study examines how a stressor affects differential gene expression in a non-model species (no reference genome or transcriptome available). We sequenced deeply to build a reference transcriptome and sequenced the samples at a "lower depth":
The reference transcriptome was de novo assembled (Trinity) from reads with a sequencing depth of 300 M (PE 2x150 nt). The statistics (TrinityStats), E50N90, BUSCO analysis, Blast2Go, and Detonate (comparison of 3 assemblies; we chose the best one) look good. The reference was made from a non-stressed individual of the non-model organism.
Triplicate samples of "negative stress control", "positive stress control" and the "treatment" were sequenced at a depth of 25 M (PE 2x75 nt).
What is the best way to use the reference transcriptome to determine differential gene expression in the samples? Any tips or tricks on how to normalize?
Thank you!
I'm no expert, but you could map the reads from your treatments to your reference transcriptome and obtain counts (with kallisto, or I think you can take the uniquely mapping reads as counts). However, you will miss any isoforms specific to a treatment, since the reference comes from a non-stressed individual. You can normalize using edgeR's TMM method (an explanation here), and I am pretty sure the route from there to differential expression is fairly standard (maybe look at edgeR's vignettes?).
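For anyone curious what TMM does under the hood, here is a rough sketch in plain Python. It is a simplification and not edgeR's implementation: edgeR's calcNormFactors also weights each gene by its approximate precision and chooses the reference sample for you, and both steps are left out here. The function name and the toy counts are made up for illustration.

```python
import math

def simple_tmm_factor(sample, ref, m_trim=0.30, a_trim=0.05):
    """Unweighted trimmed mean of M-values for `sample` relative to `ref`.

    Hypothetical helper: a simplified stand-in for TMM, not edgeR's code.
    """
    n_s, n_r = sum(sample), sum(ref)
    genes = []
    for y_s, y_r in zip(sample, ref):
        if y_s == 0 or y_r == 0:
            continue  # log ratios are undefined for zero counts
        p_s, p_r = y_s / n_s, y_r / n_r
        m = math.log2(p_s / p_r)        # M: log fold change of proportions
        a = 0.5 * math.log2(p_s * p_r)  # A: average log abundance
        genes.append((m, a))
    n = len(genes)
    cut_m, cut_a = int(n * m_trim), int(n * a_trim)
    # Drop the most extreme genes by M and by A, keep the overlap.
    by_m = sorted(range(n), key=lambda i: genes[i][0])[cut_m:n - cut_m]
    by_a = sorted(range(n), key=lambda i: genes[i][1])[cut_a:n - cut_a]
    kept = [genes[i][0] for i in set(by_m) & set(by_a)]
    if not kept:
        return 1.0
    return 2 ** (sum(kept) / len(kept))

# Toy data: nine unchanged transcripts plus one that dominates the sample.
ref_counts = [100] * 10
sample_counts = [100] * 9 + [10000]
factor = simple_tmm_factor(sample_counts, ref_counts)
effective_library_size = sum(sample_counts) * factor
```

In the toy data the single highly expressed transcript inflates the raw library size to 10,900 reads, and the factor scales that back to an effective library size of about 1,000, which is what the nine unchanged transcripts suggest.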
PS- Is it E50N90 or E90N50?
Oops, I meant E90N50 :-)
Thank you! I have considered kallisto as well.
The Trinity wiki provides a lot of guidance for exactly what you want to do: https://github.com/trinityrnaseq/trinityrnaseq/wiki/Post-Transcriptome-Assembly-Downstream-Analyses
Since you will be aligning to a transcriptome, where isoforms of the same gene share sequence, you will want to rescue multi-mapped reads rather than discard them. The Trinity developers recommend kallisto, Salmon or RSEM.
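To give a feel for what "rescuing" multi-mapped reads means, here is a toy expectation-maximization sketch in plain Python. It is only an illustration of the general idea behind these tools, under strong assumptions: it ignores transcript length, fragment length, and alignment quality, all of which the real programs model. The function name and the read sets are hypothetical.

```python
def em_expected_counts(reads, n_transcripts, n_iter=200):
    """Split each read across its compatible transcripts by current abundance.

    `reads` is a list of tuples holding the transcript indices a read maps to.
    Toy model only; real quantifiers also account for effective length.
    """
    theta = [1.0 / n_transcripts] * n_transcripts  # start from equal abundance
    counts = [0.0] * n_transcripts
    for _ in range(n_iter):
        counts = [0.0] * n_transcripts
        # E-step: assign each read fractionally to its compatible transcripts.
        for compatible in reads:
            total = sum(theta[t] for t in compatible)
            for t in compatible:
                counts[t] += theta[t] / total
        # M-step: re-estimate relative abundance from the fractional counts.
        theta = [c / len(reads) for c in counts]
    return counts

# 8 reads unique to transcript 0, 2 unique to transcript 1, 10 shared by both.
toy_reads = [(0,)] * 8 + [(1,)] * 2 + [(0, 1)] * 10
expected = em_expected_counts(toy_reads, n_transcripts=2)
```

With these toy reads the shared reads end up split roughly 8 to 2, following the unique evidence, so the expected counts come out near 16 and 4. Counting only uniquely mapping reads would have thrown away half of the data.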
As a personal note, my workflow is to map to the assembly using bowtie, estimate abundance using RSEM, then normalize with edgeR's TMM method and run the differential testing in edgeR.
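The bowtie plus RSEM part of that workflow could be scripted roughly as below. This is a sketch, not my exact pipeline: the file names, sample names, and reference name are placeholders, and the function only builds the command lines without running them. The options shown are the ones I know from RSEM's documentation, so check them against `--help` for your installed version.

```python
def rsem_commands(transcripts_fa, gene_map, samples,
                  ref_name="trinity_ref", threads=4):
    """Build RSEM command lines for a Trinity assembly (nothing is executed).

    `samples` maps a sample name to its (left, right) FASTQ paths.
    All paths and names here are placeholders.
    """
    # Index the assembly once; --bowtie also builds the bowtie index.
    prepare = ["rsem-prepare-reference", "--bowtie",
               "--transcript-to-gene-map", gene_map,
               transcripts_fa, ref_name]
    # One quantification run per paired-end sample.
    quant = [
        ["rsem-calculate-expression", "-p", str(threads), "--paired-end",
         left, right, ref_name, name]
        for name, (left, right) in samples.items()
    ]
    return [prepare] + quant

# Placeholder sample sheet: one replicate shown per group for brevity.
samples = {
    "neg_ctrl_1": ("neg_ctrl_1_R1.fq", "neg_ctrl_1_R2.fq"),
    "pos_ctrl_1": ("pos_ctrl_1_R1.fq", "pos_ctrl_1_R2.fq"),
    "treatment_1": ("treatment_1_R1.fq", "treatment_1_R2.fq"),
}
commands = rsem_commands("Trinity.fasta", "Trinity.fasta.gene_trans_map", samples)
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run` once the paths point at real files. The per-sample expected counts from RSEM are then what you would load into edgeR for TMM normalization and testing.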