Dear all,
I used Trinity to assemble my raw RNA-seq data and to look for differentially expressed (DE) genes between a control and an experimental group. My analysis included two RNA-seq samples without biological replicates: one control and one experimental. As a newcomer to NGS analysis, I followed the protocol on the Trinity website (http://trinityrnaseq.github.io/analysis/diff_expression_analysis.html). However, only 40 DE genes were identified, and the gene we expected to be upregulated was not among them. I have pasted my commands below; could you help me find any inappropriate steps? Thanks a lot!
Note: I have two samples and four FASTQ files (one left and one right read file per sample).
$ align_and_estimate_abundance.pl \
--transcripts Trinity.fasta \
--seqType fq \
--left sample1_1.fastq \
--right sample1_2.fastq \
--est_method RSEM \
--aln_method bowtie \
--SS_lib_type RF \
--thread_count 24 \
--trinity_mode \
--prep_reference # prepare the reference, then align the reads to it with bowtie
After running this step for each sample, we obtained two isoforms.results files and two genes.results files. For the further analyses we used the genes.results files, which we named sample1.genes.results and sample2.genes.results.
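Since the alignment command above is shown only for sample1, here is a minimal sketch of how the same step could be scripted for both samples. It only prints the commands (a dry run) rather than executing them. The sample names and options follow the post; the per-sample --output_dir values are an assumption added for illustration so that the two runs would not overwrite each other.

```shell
#!/bin/sh
# Dry-run sketch: print one abundance-estimation command per sample.
# Nothing is executed; pipe the output to "sh" only after checking it.
print_cmds() {
    for sample in sample1 sample2; do
        # ${sample}_rsem is a hypothetical output directory name.
        echo "align_and_estimate_abundance.pl" \
            "--transcripts Trinity.fasta --seqType fq" \
            "--left ${sample}_1.fastq --right ${sample}_2.fastq" \
            "--est_method RSEM --aln_method bowtie --SS_lib_type RF" \
            "--thread_count 24 --trinity_mode --prep_reference" \
            "--output_dir ${sample}_rsem"
    done
}

print_cmds
```

Each printed line corresponds to one sample, so the two genes.results files end up in separate directories before being renamed.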
$ abundance_estimates_to_matrix.pl \
--est_method RSEM \
sample1.genes.results sample2.genes.results
$ run_DE_analysis.pl \
--matrix matrix.count.matrix \
--method edgeR
$ cd edgeR ...dir/
$ analyze_diff_expr.pl \
--matrix /home/.../Trinity_trans.TMM.fpkm.matrix \
-P 1e-3 \
-C 2
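For reference, -P 1e-3 is a cutoff on the FDR-corrected p-value and -C 2 is a minimum absolute log2 fold change (that is, at least 4-fold). The sketch below illustrates that kind of filter with awk on a made-up table. The file name, gene names, values, and column layout are all hypothetical, and the exact comparison operators Trinity uses internally are an assumption.

```shell
#!/bin/sh
# Hypothetical DE table: gene id, log2 fold change, FDR (tab-separated).
printf '%s\t%s\t%s\n' \
    gene  logFC FDR \
    geneA 3.1   0.0002 \
    geneB -2.5  0.0009 \
    geneC 1.2   0.00001 \
    geneD 4.0   0.02 > de_results_example.tsv

# Print gene ids with FDR <= P and |log2FC| >= C.
filter_de() {
    awk -F'\t' -v P=0.001 -v C=2 '
        NR == 1 { next }                  # skip the header row
        { fc = ($2 < 0) ? -$2 : $2 }      # absolute log2 fold change
        $3 <= P && fc >= C { print $1 }
    ' "$1"
}

filter_de de_results_example.tsv
```

In this toy table geneC is dropped because its fold change is too small and geneD because its FDR is too high, so a gene with a real but modest change, or one whose FDR is inflated by the lack of replicates, would not pass these thresholds.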
Why does it surprise you that a DE analysis without replication yields a low number of significant genes? Sometimes I think that succumbing to the pressure from 'experimentalists' to implement magic tricks that produce some sort of 'p-values' from non-replicated experiments was a really bad idea, and a top candidate for "What Are The Most Common Stupid Mistakes In Bioinformatics?"