de novo RNA seq, combining left and right reads to create assembly
8.6 years ago

Hello all,

I am performing de novo transcriptome assembly on my paired-end Illumina reads. I am going to create the assembly using Trinity, but how do I go about concatenating my left and right reads? Do I need to merge each sample's left and right reads first and then somehow combine all of my samples? I am not familiar with the Unix commands that would do this.

Thanks for the help! Nikelle

assembly RNA-Seq trinity transcriptome • 3.7k views

Why are you looking to concatenate or merge your R1/R2 reads? There is no point in concatenating them, and they can't be merged unless the insert size allows it (i.e. the combined length of the two reads is greater than the insert size, so that R1 and R2 overlap).
As the Trinity documentation recommends, this is all you need to do:

Trinity --seqType fq --left reads_1.fq --right reads_2.fq --CPU 6 --max_memory 20G
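
To make the merging condition above concrete, here is a minimal sketch in shell. The helper name `can_merge` and the example numbers are hypothetical, and "fragment length" is used here as one reading of "insert size". Real read mergers also require some minimum overlap, so treat this as a rough check only.

```shell
# Hypothetical helper: R1 and R2 can only overlap if their combined
# length is greater than the length of the sequenced fragment.
can_merge() {
    read_len=$1
    fragment_len=$2
    [ $(( 2 * read_len )) -gt "$fragment_len" ]
}

# 2 x 100 bp reads from a 300 bp fragment do not overlap.
if can_merge 100 300; then
    echo "2x100 on 300 bp: reads overlap, merging is possible"
else
    echo "2x100 on 300 bp: no overlap, leave R1/R2 as separate files"
fi
```

Either way, Trinity takes the two files separately through --left and --right, so merging is not required for the assembly.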

Thank you,

So, as Damian Kao said below, should I be passing all of my left reads to that Trinity command, and then all of my right reads as well?


As @Damian pointed out in the example below, make sure the files are listed in the same order for --left and --right. Within each pair of files, the reads should also be in the same order (if you did any trimming, hopefully you used a PE-aware trimmer).
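
A rough way to check the second point is to compare read counts and read names between the two files of a pair. The sketch below assumes uncompressed FASTQ with four lines per record and read names that differ only by a trailing /1 or /2 (or by a comment after a space). The demo files and the helper names `count_reads` and `read_names` are made up for illustration.

```shell
# Count reads in an uncompressed FASTQ file (4 lines per record).
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Print read names without the leading @, any comment, or a /1 or /2 suffix.
read_names() {
    awk 'NR % 4 == 1 { sub(/^@/, ""); sub(/ .*/, ""); sub(/\/[12]$/, ""); print }' "$1"
}

# Two tiny demo files standing in for a real R1/R2 pair.
tmp=$(mktemp -d)
printf '@read1/1\nACGT\n+\nIIII\n@read2/1\nTTGA\n+\nIIII\n' > "$tmp/demo_1.fq"
printf '@read1/2\nACGT\n+\nIIII\n@read2/2\nCCGA\n+\nIIII\n' > "$tmp/demo_2.fq"

n1=$(count_reads "$tmp/demo_1.fq")
n2=$(count_reads "$tmp/demo_2.fq")

# Compare the name lists; cmp -s is silent and only sets the exit status.
read_names "$tmp/demo_1.fq" > "$tmp/names_1.txt"
read_names "$tmp/demo_2.fq" > "$tmp/names_2.txt"
if cmp -s "$tmp/names_1.txt" "$tmp/names_2.txt"; then
    order_ok=yes
else
    order_ok=no
fi

echo "left reads: $n1, right reads: $n2, same order: $order_ok"
```

If the counts differ or the order check fails for a pair, that pair was probably not trimmed in paired-end mode.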


By the way, Trinity will only use PE reads as extra information for bundling reads during the Chrysalis stage (unless something has changed in the newer versions that I am not aware of). It doesn't attempt to do any scaffolding with them.


Thanks! And in the command below, is it correct that there are no spaces before or after each comma?

Nikelle

8.6 years ago

You can give Trinity a list of files separated by commas. For example:

Trinity --left left_1.fq,left_2.fq,left_3.fq --right right_1.fq,right_2.fq,right_3.fq --CPU 6 --max_memory 20G....
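
With many samples it is easy to slip a space into such a list, and the shell would then pass the next file name as a separate argument. Below is a small sketch that builds the lists instead of typing them. The helper `join_by_comma` is hypothetical and the file names are the placeholders from the example above. It only prints the command so you can review it before running anything.

```shell
# Join all arguments with commas and no surrounding spaces.
# The function body runs in a subshell, so the IFS change does not leak out.
join_by_comma() (
    IFS=,
    printf '%s\n' "$*"
)

left_list=$(join_by_comma left_1.fq left_2.fq left_3.fq)
right_list=$(join_by_comma right_1.fq right_2.fq right_3.fq)

# Print the command for review instead of running it.
echo "Trinity --seqType fq --left $left_list --right $right_list --CPU 6 --max_memory 20G"
```

Keeping the left and right arguments in the same sample order in both calls is what keeps the pairs matched.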

Thank you! And so this will generate one assembly using all of those left and right reads?


Yes, that command will allow you to use all the FASTQ files listed.


Great, thanks very much.


Hi Damian,

So I ran my Trinity command in screen so that it could run in the background.

Trinity version: v2.0.6
-ERROR: couldn't run the network check to confirm latest Trinity software version.

Wednesday, April 27, 2016: 15:04:21     CMD: java -Xmx64m -jar /usr/local/bin/trinityrnaseq-2.0.6/util/support_scripts/ExitTester.jar 0
Wednesday, April 27, 2016: 15:04:21     CMD: java -Xmx64m -jar /usr/local/bin/trinityrnaseq-2.0.6/util/support_scripts/ExitTester.jar 1
Wednesday, April 27, 2016: 15:04:21     CMD: mkdir -p /home/richardsonlab/AMMA_transcripts/allsampletrinitysassembly
Wednesday, April 27, 2016: 15:04:21     CMD: mkdir -p /home/richardsonlab/AMMA_transcripts/allsampletrinitysassembly/chrysalis

-------------- Trinity Phase 1: Clustering of RNA-Seq Reads ---------------------

Converting input files. (in parallel)Wednesday, April 27, 2016: 15:04:21        CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/npetrill/Trimmomatic-0-3.35/pairedoutput1 >> left.fa 2> /home/npetrill/Trimmomatic-0-3.35/pairedoutput1.readcount
Wednesday, April 27, 2016: 15:04:21     CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastooNIKELLEs-MacBook-Pro:~ nikellepetrillo$ 
t.fa 2> /home/npetrill/Trimmomatic-0-3.35/R1V29pairedoutput30_1.readcount
Thread 2 terminated abnormally: Error, cmd: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastoolNIKELLEs-MacBook-Pro:~ nikellepetrillo$ 
/home/npetrill/Trimmomatic-0-3.35/pairedoutput2.readcount  died with ret 256 at /usr/local/bin/trinityNIKELLEs-MacBook-Pro:~ nikellepetrillo$ 
packet_write_wait: Connection to 172.16.45.3: Broken pipe

It looks like I'm getting back some errors... Do you know what to make of this? Is Trinity still running, and how do I check this?


I think one of your input files is possibly named wrong. Does this file exist?

/home/npetrill/Trimmomatic-0-3.35/pairedoutput1

The Trinity Google groups might know more about these errors: https://groups.google.com/forum/#!forum/trinityrnaseq-users


Thanks for your quick reply. Yes it does exist. Maybe I will try that group!


Can you post the command you actually used? Perhaps there is something wrong there.


It's very long! But here it is:

--seqType fq --left /home/npetrill/Trimmomatic-0-3.35/pairedoutput1,/home/npetrill/Trimmomatic-0-3.35/R1V29pairedoutput30_1,/home/npetrill/Trimmomatic-0-3.35/R1V34pairedoutput30_1,/home/npetrill/Trimmomatic-0-3.35/R1V39pairedoutput30_1,/home/npetrill/Trimmomatic-0-3.35/R1V42pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR1V49/R1V49pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR1V60/R1V60pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR1V6/R1V6pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V13/R2V13pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V18/R2V18pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V1/R2V1pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V33/R2V33pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V46/R2V46pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V57/R2V57pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V59/R2V59pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V5/R2V5pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR2V62/R2V62pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V14/R3V14pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V16/R3V16pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V20/R3V20pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V26/R3V26pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V31/R3V31pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR3V54/R3V54pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V14/R4V14pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V23/R4V23pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V25/R4V25pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V54/R4V54pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V58/R4V58pairedoutput30_1,/home/richardsonlab/AMMA_transcript
s/30trimmoR4V6/R4V6pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR4V9/R4V9pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V19/R5V19pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V28/R5V28pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V33/R5V33pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V4/R5V4pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V54/R5V54pairedoutput30_1,/home/richardsonlab/AMMA_transcripts/30trimmoR5V8/R5V8pairedoutput30_1 --right /home/npetrill/Trimmomatic-0-3.35/pairedoutput2,/home/npetrill/Trimmomatic-0-3.35/R1V29pairedoutput30_2,/home/npetrill/Trimmomatic-0-3.35/R1V34pairedoutput30_2,/home/npetrill/Trimmomatic-0-3.35/R1V39pairedoutput30_2,/home/npetrill/Trimmomatic-0-3.35/R1V42pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR1V49/R1V49pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR1V60/R1V60pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR1V6/R1V6pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V13/R2V13pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V18/R2V18pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V1/R2V1pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V33/R2V33pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V46/R2V46pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V57/R2V57pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V59/R2V59pairedoutput30_2,/home/richardsonlab/AMMA_transcripts/30trimmoR2V5/R2V5pairedoutput30_2 --CPU 6 --max_memory 60G --output /home/richardsonlab/AMMA_transcripts/allsampletrinitysassembly
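
For a command this long, one quick sanity check is to count how many files are in the --left list and how many are in the --right list, since every left file needs its partner. A minimal sketch, with short made-up lists standing in for the real ones and a hypothetical helper `count_entries`:

```shell
# Count the entries in a comma-separated list.
count_entries() {
    # Put one entry per line, count the lines, and strip any padding from wc.
    printf '%s\n' "$1" | tr ',' '\n' | wc -l | tr -d ' '
}

# Placeholder lists; paste the real --left and --right values here.
left="sampleA_1,sampleB_1,sampleC_1"
right="sampleA_2,sampleB_2"

n_left=$(count_entries "$left")
n_right=$(count_entries "$right")

if [ "$n_left" -ne "$n_right" ]; then
    echo "Mismatch: $n_left left files vs $n_right right files"
else
    echo "OK: $n_left files in each list"
fi
```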

The first step of Trinity is to convert and concatenate all your left/right input files into one huge .fasta file using fastool. It is possible that Trinity first needs to know whether your input files are .gz compressed so that it can use the proper fastool command to concatenate them. You might need to indicate compressed files with a .fastq.gz file extension.
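
If you are not sure whether a file without an extension is compressed, you can look at its first two bytes, since gzip files always start with 1f 8b. This is only a sketch: the helper `is_gzipped` and the demo files are made up, and it says nothing about how Trinity itself decides.

```shell
# Return success if the file starts with the gzip magic bytes 1f 8b.
is_gzipped() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Demo: one plain FASTQ file and a gzip-compressed copy of it.
tmp=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' > "$tmp/demo.fq"
gzip -c "$tmp/demo.fq" > "$tmp/demo.fq.gz"

for f in "$tmp/demo.fq" "$tmp/demo.fq.gz"; do
    if is_gzipped "$f"; then
        echo "$f: gzip compressed"
    else
        echo "$f: not gzip compressed"
    fi
done
```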


You also seem to have omitted some parts of the error message that you had pasted in the post above. Perhaps a vital clue is missing in that text. Can you edit that post and re-paste the error output?


Unfortunately, the error message exceeds the character limit.

I ended up terminating the run and started a new run. I am now getting a new error message:

Thursday, April 28, 2016: 11:15:27 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR3V54/R3V54pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR3V54/R3V54pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:15:41 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V14/R4V14pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V14/R4V14pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:16:25 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V23/R4V23pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V23/R4V23pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:17:18 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V25/R4V25pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V25/R4V25pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:17:57 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V54/R4V54pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V54/R4V54pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:18:50 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V58/R4V58pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V58/R4V58pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:19:48 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V6/R4V6pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V6/R4V6pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:20:43 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR4V9/R4V9pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR4V9/R4V9pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:21:34 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V19/R5V19pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V19/R5V19pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:22:26 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V28/R5V28pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V28/R5V28pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:23:10 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V33/R5V33pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V33/R5V33pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:23:58 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V4/R5V4pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V4/R5V4pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:25:00 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V54/R5V54pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V54/R5V54pairedoutput30_1.readcount
Thursday, April 28, 2016: 11:26:07 CMD: /usr/local/bin/trinityrnaseq-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /home/richardsonlab/AMMA_transcripts/30trimmoR5V8/R5V8pairedoutput30_1 >> left.fa 2> /home/richardsonlab/AMMA_transcripts/30trimmoR5V8/R5V8pairedoutput30_1.readcount
Use of uninitialized value in array dereference at /usr/local/bin/trinityrnaseq-2.0.6/Trinity line 1212.
Trinity run failed. Must investigate error above.

Unfortunately, I can't see the top part of this error message since I cannot scroll up while in screen mode.


Run the Trinity command like this. It will capture the stdout and stderr messages (the stuff that scrolls off screen) to files. Then you can show us the relevant errors from those files.

  $  trinity commands  > log_file 2> err_file
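
Here is a small self-contained demonstration of what those two redirections do. `demo_command` is a made-up stand-in for the real Trinity command, and the files go into a temporary directory here rather than the directory you run from.

```shell
# Stand-in for the real command: one line to stdout, one line to stderr.
demo_command() {
    echo "progress message"
    echo "error message" >&2
}

workdir=$(mktemp -d)

# > sends stdout to log_file, 2> sends stderr to err_file.
demo_command > "$workdir/log_file" 2> "$workdir/err_file"

# Look at the end of the error file, which is usually where a failure shows up.
tail -n 20 "$workdir/err_file"
```

Since the files are ordinary text, you can open them with less or tail from any terminal, including one outside the screen session.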

Thanks, do you know where I can find those error files?


They should be in the directory from where you ran the trinity command.

