Hi all, I am running an assembly with Trinity using the following command:
Trinity --seqType fq --max_memory 100G --right file_1.fastq --left file_2.fastq --CPU 16 --output SRR-trinity-output
However, it keeps failing with the error shown below. Can someone please help me resolve this issue? Thanks.
Error I am getting:
Wednesday, November 8, 2017: 18:40:29 CMD: java -Xmx64m -XX:ParallelGCThreads=2 -jar /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1/util/support_scripts/ExitTester.jar 0
Wednesday, November 8, 2017: 18:40:29 CMD: java -Xmx64m -XX:ParallelGCThreads=2 -jar /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1/util/support_scripts/ExitTester.jar 1
----------------------------------------------------------------------------------
-------------- Trinity Phase 1: Clustering of RNA-Seq Reads ---------------------
----------------------------------------------------------------------------------
---------------------------------------------------------------
------------ In silico Read Normalization ---------------------
-- (Removing Excess Reads Beyond 50 Coverage --
---------------------------------------------------------------
# running normalization on reads: $VAR1 = [
[
'/media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file_2.fastq'
],
[
'/media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file_1.fastq'
]
];
Wednesday, November 8, 2017: 18:40:29 CMD: /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1/util/insilico_read_normalization.pl --seqType fq --JM 100G --max_cov 50 --CPU 16 --output /media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file-trinity-output/insilico_read_normalization --max_pct_stdev 10000 --left /media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file_2.fastq --right /media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file_1.fastq --pairs_together --PARALLEL_STATS
Converting input files. (both directions in parallel)CMD: seqtk-trinity seq -A /media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file_2.fastq >> left.fa
Error, not recognizing read name formatting: [file.1]
If your data come from SRA, be sure to dump the fastq file like so:
SRA_TOOLKIT/fastq-dump --defline-seq '@$sn[_$rn]/$ri' --split-files file.sra
Thread 1 terminated abnormally: Error, cmd: seqtk-trinity seq -A /media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file_2.fastq >> left.fa died with ret 512 at /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1/util/insilico_read_normalization.pl line 758.
CMD: seqtk-trinity seq -A /media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file_1.fastq >> right.fa
Error, not recognizing read name formatting: [file.1]
If your data come from SRA, be sure to dump the fastq file like so:
SRA_TOOLKIT/fastq-dump --defline-seq '@$sn[_$rn]/$ri' --split-files file.sra
Thread 2 terminated abnormally: Error, cmd: seqtk-trinity seq -A /media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file_1.fastq >> right.fa died with ret 512 at /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1/util/insilico_read_normalization.pl line 758.
Error, conversion thread failed at /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1/util/insilico_read_normalization.pl line 329.
Error, cmd: /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1/util/insilico_read_normalization.pl --seqType fq --JM 100G --max_cov 50 --CPU 16 --output /media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file-trinity-output/insilico_read_normalization --max_pct_stdev 10000 --left /media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file_2.fastq --right /media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test-plantae/pan_sra/SRX/test/file_1.fastq --pairs_together --PARALLEL_STATS died with ret 7424 at /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1//Trinity line 2544.
main::process_cmd('/home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v...') called at /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1//Trinity line 3090
main::normalize('/media/owner/b54f3251-5380-4288-9ddf-fa3357ea8294/test...', 50, 'ARRAY(0xd58eb8)', 'ARRAY(0xd64760)') called at /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1//Trinity line 3037
main::run_normalization(50, 'ARRAY(0xd58eb8)', 'ARRAY(0xd64760)') called at /home/owner/Desktop/pan/executables/trinityrnaseq-Trinity-v2.5.1//Trinity line 1297
Looks like it could be a problem with the FASTQ formatting. I've seen this error reported a few times across the web, but with no solution. Take a look here to validate your FASTQ files: Fastq Quality Read And Score Length Check
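As a quick check of the read names themselves, here is a minimal Python sketch. It assumes plain 4-line FASTQ records and assumes that an acceptable read name ends in /1 or /2, which is what the fastq-dump hint in the error message produces; Trinity's own check may accept other formats as well. The function names and the sample headers are hypothetical, chosen only for illustration.

```python
import re
from itertools import islice

# Assumption: a usable read name ends in "/1" or "/2", as produced by the
# fastq-dump --defline-seq hint printed in the error message.
MATE_SUFFIX = re.compile(r"/[12]$")


def read_names(fastq_lines, limit=1000):
    """Yield the read name of up to `limit` records (4 lines per record assumed)."""
    for i, line in enumerate(islice(fastq_lines, limit * 4)):
        if i % 4 != 0:
            continue  # only the first line of each record is a header
        header = line.strip()
        if not header.startswith("@"):
            raise ValueError(f"record {i // 4 + 1}: header lacks '@': {header!r}")
        fields = header[1:].split()
        # The read name is the text between '@' and the first whitespace.
        yield fields[0] if fields else ""


def names_without_mate_suffix(fastq_lines, limit=1000):
    """Return the read names that do not end in /1 or /2."""
    return [name for name in read_names(fastq_lines, limit)
            if not MATE_SUFFIX.search(name)]


# Hypothetical records: the first has a /1 suffix, the second does not.
with_suffix = ["@file.1/1\n", "ACGT\n", "+\n", "IIII\n"]
without_suffix = ["@file.1 1 length=4\n", "ACGT\n", "+\n", "IIII\n"]
print(names_without_mate_suffix(with_suffix))
print(names_without_mate_suffix(without_suffix))
```

To run it on a real file, pass an open file handle, for example names_without_mate_suffix(open("file_1.fastq")). Any names it returns are candidates for the "not recognizing read name formatting" message.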
Edit: I just noticed that you're also specifying file_1 as the right reads and file_2 as the left. Did you mean to do that?
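For comparison, a small sketch of the same command with the mates in the usual order, assuming file_1.fastq holds the first mates and file_2.fastq the second. The helper trinity_command is hypothetical; the options are the ones from the original command.

```python
import shlex


def trinity_command(read1, read2, output_dir, max_memory="100G", cpu=16):
    """Build the Trinity argument list with read 1 as --left and read 2 as --right."""
    return [
        "Trinity", "--seqType", "fq", "--max_memory", max_memory,
        "--left", read1, "--right", read2,
        "--CPU", str(cpu), "--output", output_dir,
    ]


cmd = trinity_command("file_1.fastq", "file_2.fastq", "SRR-trinity-output")
print(shlex.join(cmd))  # shell-ready command line (Python 3.8+)
```

Swapping the files back would not by itself fix the read name error above; it only addresses the left/right assignment.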
Thanks. Yes, the two files are paired reads produced by
fastq-dump --defline-seq '@$sn[_$rn]/$ri' --split-files *.sra
which generates the left and right FASTQ files. Additionally, the same error occurs with the test FASTQ files from the Trinity package itself. I tried
make test_trinity
and it generates the same error.
I can't be sure, but it looks like a bug in the program. There appears to be a solution here: https://groups.google.com/forum/#!topic/trinityrnaseq-users/Mo4hrTo5dSM
Thanks for posting the answer below. No doubt others will encounter the same problem.