All samples should be used together to assemble a single transcriptome. My workflow:
1: Genome-guided assembly with Trinity
Trinity --seqType fq --max_memory 20G --samples_file sample_file.config --genome_guided_bam ref_sorted.bam --genome_guided_max_intron 10000 --CPU 60
The assembled transcripts are written to: trinity_out_dir/Trinity-GG.fasta
sample_file.config is a tab-delimited text file. Suppose you have 4 samples with 3 replicates each; sample_file.config then looks like this:
sample1 sample1-rep1 sample1-rep1_1.fq sample1-rep1_2.fq
sample1 sample1-rep2 sample1-rep2_1.fq sample1-rep2_2.fq
sample1 sample1-rep3 sample1-rep3_1.fq sample1-rep3_2.fq
sample2 sample2-rep1 sample2-rep1_1.fq sample2-rep1_2.fq
sample2 sample2-rep2 sample2-rep2_1.fq sample2-rep2_2.fq
sample2 sample2-rep3 sample2-rep3_1.fq sample2-rep3_2.fq
sample3 sample3-rep1 sample3-rep1_1.fq sample3-rep1_2.fq
sample3 sample3-rep2 sample3-rep2_1.fq sample3-rep2_2.fq
sample3 sample3-rep3 sample3-rep3_1.fq sample3-rep3_2.fq
sample4 sample4-rep1 sample4-rep1_1.fq sample4-rep1_2.fq
sample4 sample4-rep2 sample4-rep2_1.fq sample4-rep2_2.fq
sample4 sample4-rep3 sample4-rep3_1.fq sample4-rep3_2.fq
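If typing the file by hand is error-prone, it can be generated. The following is a minimal sketch that assumes the read files follow exactly the naming pattern in the listing above; the function name `write_samples_file` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: write a tab-delimited Trinity samples file for 4 samples x 3 replicates.
# Assumes files are named sampleN-repM_1.fq / sampleN-repM_2.fq as in the listing.
set -euo pipefail

write_samples_file() {
    local out=$1 s r rep
    : > "$out"                          # create or truncate the output file
    for s in 1 2 3 4; do
        for r in 1 2 3; do
            rep="sample${s}-rep${r}"
            # columns: condition, replicate name, left reads, right reads
            printf '%s\t%s\t%s\t%s\n' "sample${s}" "$rep" "${rep}_1.fq" "${rep}_2.fq" >> "$out"
        done
    done
}

write_samples_file sample_file.config
```

Adjust the two loops if your sample names or replicate counts differ.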
Before running the Trinity assembly, align all the fq files to the elephant shark genome (hisat), then merge (samtools) all the SAM results into one coordinate-sorted BAM file, "ref_sorted.bam".
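The alignment and merge step could look roughly like the sketch below. It is not the exact commands I ran: the index prefix, the `bam` output directory, the thread count and the function name `emit_alignment_commands` are all assumptions. The function only prints the commands, so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Sketch: print one HISAT2 + samtools sort command per replicate, then the merge.
# Nothing is executed here; the commands are only written to stdout.
set -euo pipefail

# Arguments: samples file, HISAT2 index prefix (hypothetical), threads per command
emit_alignment_commands() {
    local samples=$1 index=$2 threads=$3
    local _cond rep left right
    echo "mkdir -p bam"
    # columns of the samples file: condition, replicate name, left reads, right reads
    while read -r _cond rep left right || [ -n "${rep:-}" ]; do
        [ -n "$rep" ] || continue       # skip blank lines
        echo "hisat2 -p $threads -x $index -1 $left -2 $right | samtools sort -@ $threads -o bam/${rep}.bam -"
    done < "$samples"
    # merging coordinate-sorted BAMs gives a coordinate-sorted BAM
    echo "samtools merge -@ $threads ref_sorted.bam bam/*.bam"
    echo "samtools index ref_sorted.bam"
}
```

For example, `emit_alignment_commands sample_file.config genome_index 8` prints the commands, and appending `| bash` runs them. The index itself would have to be built beforehand with `hisat2-build`.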
2: Transcript Quantification with salmon
trinityrnaseq-v2.11.0/util/align_and_estimate_abundance.pl \
--transcripts ./trinity_out_dir/Trinity-GG.fasta \
--seqType fq \
--samples_file sample_file.config \
--output_dir salmon_transcript_quantification \
--aln_method bowtie2 \
--thread_count 60 \
--est_method salmon \
--trinity_mode --prep_reference
(The salmon.gene.counts.matrix used in step 3 is not written by this script itself; it is built from the per-replicate quant.sf files with trinityrnaseq-v2.11.0/util/abundance_estimates_to_matrix.pl.)
3: DE analysis with DESeq2
trinityrnaseq-v2.11.0/Analysis/DifferentialExpression/run_DE_analysis.pl \
--matrix salmon_transcript_quantification/salmon.gene.counts.matrix \
--method DESeq2 \
--samples_file sample_file.config \
--contrasts contrasts.file \
--output Differential_Expression_Analysis
contrasts.file looks like this (assuming the comparisons are sample1 vs sample2 and sample3 vs sample4):
sample1 sample2
sample3 sample4
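The names in contrasts.file have to match the first column of sample_file.config. A quick sanity check before running the DE step could look like this sketch; the helper name `check_contrasts` is made up for illustration and is not part of Trinity.

```shell
#!/usr/bin/env bash
# Sketch: verify that every name in the contrasts file appears as a
# condition (first column) in the samples file.
set -euo pipefail

check_contrasts() {
    local samples=$1 contrasts=$2
    local a b name missing=0
    while read -r a b || [ -n "${a:-}" ]; do
        [ -n "$a" ] || continue         # skip blank lines
        for name in "$a" "$b"; do
            # awk exits 0 only if some line has this name in column 1
            if ! awk -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$samples"; then
                echo "contrast name not in samples file: $name" >&2
                missing=1
            fi
        done
    done < "$contrasts"
    return "$missing"
}
```

Running `check_contrasts sample_file.config contrasts.file` returns a non-zero status and names the offending entry if there is a mismatch.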
reference: https://github.com/trinityrnaseq/trinityrnaseq/wiki
Have you tried running Trinity with in silico normalization to 50x coverage using all reads? I think this would be the best solution if you don't have enough RAM.
Thank you for the advice, Shelkmike! I'll try to run it this way. Jaime