Question

Differential expression analysis without a reference genome

5.2 years ago

mxlsherry1992 ▴ 80

Dear all,

I am doing differential expression analysis on RNA-seq data without a reference genome. After assembling with Trinity to get Trinity.fasta, the next step should be to run align_and_estimate_abundance.pl to map the reads from each sample to the Trinity.fasta file. My question is: do we need to run cd-hit on Trinity.fasta first to remove redundancy, and only then run align_and_estimate_abundance.pl?

Thank you!

RNA-Seq • 1.1k views

updated 5.2 years ago by h.mon 35k • written 5.2 years ago by mxlsherry1992 ▴ 80

score 0 · Answer 1 · 2019-09-30

But the question is: do we need to run cd-hit to remove redundancy from Trinity.fasta first?

One could, but there are probably better options for reducing redundancy in the assembly; have a look at the Trinity FAQ. I would first consider:

Using the SuperTranscripts method, which produces a genome-like gene representation of the transcriptome assembly. You can then follow up with Differential Transcript Usage via SuperTranscripts.

You can also use the Trinity.fasta.gene_trans_map generated by Trinity to get "gene" counts in addition to the transcript counts.

Filtering out transcripts with low counts after quantifying transcript abundance (this can be applied in conjunction with the two methods above).
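To make the last two suggestions concrete, here is a minimal Python sketch of what "gene" counts and low-count filtering amount to. It assumes the gene_trans_map is a two-column, tab-separated file with the gene ID first and the transcript ID second. The function names, the toy counts, and the thresholds (`min_count=10`, `min_samples=2`) are all made up for illustration and are not Trinity defaults. In practice you would probably rely on the quantification tool's own gene-level output rather than rolling your own.

```python
import csv
from io import StringIO


def read_gene_trans_map(handle):
    """Return {transcript_id: gene_id} from a tab-separated map (gene ID first)."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2:
            gene_id, transcript_id = row[0], row[1]
            mapping[transcript_id] = gene_id
    return mapping


def sum_to_genes(transcript_counts, trans_to_gene):
    """Sum per-sample transcript counts into per-gene counts."""
    gene_counts = {}
    for transcript_id, counts in transcript_counts.items():
        # Transcripts missing from the map are kept under their own ID.
        gene_id = trans_to_gene.get(transcript_id, transcript_id)
        if gene_id not in gene_counts:
            gene_counts[gene_id] = [0.0] * len(counts)
        gene_counts[gene_id] = [a + b for a, b in zip(gene_counts[gene_id], counts)]
    return gene_counts


def filter_low_counts(counts_by_feature, min_count=10, min_samples=2):
    """Keep features with at least `min_count` reads in at least `min_samples` samples."""
    return {
        feature: counts
        for feature, counts in counts_by_feature.items()
        if sum(c >= min_count for c in counts) >= min_samples
    }


# Toy data, invented for illustration only (three samples per transcript).
example_map = StringIO(
    "TRINITY_DN1_c0_g1\tTRINITY_DN1_c0_g1_i1\n"
    "TRINITY_DN1_c0_g1\tTRINITY_DN1_c0_g1_i2\n"
    "TRINITY_DN2_c0_g1\tTRINITY_DN2_c0_g1_i1\n"
)
example_counts = {
    "TRINITY_DN1_c0_g1_i1": [12, 8, 15],
    "TRINITY_DN1_c0_g1_i2": [3, 4, 0],
    "TRINITY_DN2_c0_g1_i1": [1, 0, 2],
}

trans_to_gene = read_gene_trans_map(example_map)
gene_counts = sum_to_genes(example_counts, trans_to_gene)
kept_genes = filter_low_counts(gene_counts, min_count=10, min_samples=2)
print(kept_genes)
```

With the toy data, the two isoforms of the first gene are summed and pass the filter, while the second gene is dropped for low counts. The same filter function works on transcript-level counts if you prefer to filter before aggregating.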