18 months ago
Marta
Hello everyone,
I'm trying to assemble a de novo transcriptome with Trinity and am running into many issues. On my latest attempt I got the Inchworm error shown below.
********************************************************************
** Warning, Trinity cannot determine which version of Java is being used. Version 1.7 is required.
Attempting to continue in 30 seconds
********************************************************************
Trinity version: v2.1.1
-ERROR: couldn't run the network check to confirm latest Trinity software version.
Wednesday, May 31, 2023: 13:16:33 CMD: java -Xmx64m -XX:ParallelGCThreads=2 -jar /LUSTRE/home/BIO176/garrigosm/anaconda3/envs/tortolas/opt/trinity-2.1.1/util/support_scripts/ExitTest$
Wednesday, May 31, 2023: 13:16:33 CMD: java -Xmx64m -XX:ParallelGCThreads=2 -jar /LUSTRE/home/BIO176/garrigosm/anaconda3/envs/tortolas/opt/trinity-2.1.1/util/support_scripts/ExitTest$
----------------------------------------------------------------------------------
-------------- Trinity Phase 1: Clustering of RNA-Seq Reads ---------------------
----------------------------------------------------------------------------------
Converting input files. (in parallel)
Wednesday, May 31, 2023: 13:16:33 CMD: cat /LUSTRE/home/BIO176/garrigosm/tortolas/trinity/../reads_norrna/all_reads_R1.fastq | fastool --illumina-trin$
Wednesday, May 31, 2023: 13:16:33 CMD: cat /LUSTRE/home/BIO176/garrigosm/tortolas/trinity/../reads_norrna/all_reads_R2.fastq | fastool --illumina-trinity --to-fasta >> right.fa 2> /L$
-conversion of 639072274 from FQ to FA format succeeded.
-conversion of 639072274 from FQ to FA format succeeded.
Wednesday, May 31, 2023: 13:57:22 CMD: touch left.fa.ok right.fa.ok
Wednesday, May 31, 2023: 13:57:22 CMD: cat left.fa right.fa > both.fa
-------------------------------------------
----------- Jellyfish --------------------
-- (building a k-mer catalog from reads) --
-------------------------------------------
* Running CMD: jellyfish count -t 20 -m 25 -s 18178674418 --canonical both.fa
* Running CMD: jellyfish dump -L 1 mer_counts.jf > jellyfish.kmers.fa
* Running CMD: jellyfish histo -t 20 -o jellyfish.kmers.fa.histo mer_counts.jf
* Running CMD: /LUSTRE/home/BIO176/garrigosm/anaconda3/envs/tortolas/opt/trinity-2.1.1/Inchworm/bin//inchworm --kmers jellyfish.kmers.fa --run_inchworm -K 25 -L 25 --monitor 1 --DS --nu$
sh: línea 1: 3962748 Terminado (killed) /LUSTRE/home/BIO176/garrigosm/anaconda3/envs/tortolas/opt/trinity-2.1.1/Inchworm/bin//inchworm --kmers jellyfish.kmers.fa --run_inchworm -K 25 $
Kmer length set to: 25
Min assembly length set to: 25
Monitor turned on, set to: 1
double stranded mode set
setting number of threads to: 6
-setting parallel iworm mode.
-reading Kmer occurences...
^M [0M] Kmers parsed. ^M [1M] Kmers parsed. ^M [2M] Kmers parsed. ^M [2M] Kmers parsed. ^M [3M] Kmers parsed. ^M [3M] Kmers parsed. ^M [4M] Kmers parsed. ^M [5M$
Pipeliner::run(Pipeliner=HASH(0x55918cd0a9a0)) called at /home/BIO176/garrigosm/anaconda3/envs/tortolas/bin/Trinity line 2054
eval {...} called at /home/BIO176/garrigosm/anaconda3/envs/tortolas/bin/Trinity line 2049
main::run_inchworm("/LUSTRE/home/BIO176/garrigosm/tortolas/trinity/trinity_out_di"..., "both.fa", undef, "") called at /home/BIO176/garrigosm/anaconda3/envs/tortolas/bin/Trinity li$
main::run_Trinity() called at /home/BIO176/garrigosm/anaconda3/envs/tortolas/bin/Trinity line 1206
eval {...} called at /home/BIO176/garrigosm/anaconda3/envs/tortolas/bin/Trinity line 1205
If it indicates bad_alloc(), then Inchworm ran out of memory. You'll need to either reduce the size of your data set or run Trinity on a server with more memory available.
** The inchworm process failed.
The command I used is:
Trinity --seqType fq --left ../reads_norrna/all_reads_R1.fastq --right ../reads_norrna/all_reads_R2.fastq --CPU 20 --max_memory 350G --no_bowtie
Thank you in advance!
This is likely the issue. Trinity requires about 1 GB of RAM per million paired-end reads (LINK). How much data do you have?
I have around 300k reads in each file (R1 and R2), but I also tried it with only one sample (15472207 reads) and still got errors. Thanks!
That is 15 million reads, not 300K. Even so, 350G of RAM should have been plenty. Are your files complete, i.e. free of any issues?
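One quick way to check whether a pair of FASTQ files is complete is to count the records in each and compare. The sketch below assumes uncompressed files with the usual four lines per record; the function names `count_fastq_reads` and `check_pair` are illustrative and not part of Trinity.

```shell
# Count the records in an uncompressed FASTQ file, assuming the standard
# four-lines-per-record layout (no wrapped sequence lines).
count_fastq_reads() {
    lines=$(wc -l < "$1" | tr -d ' ')
    # A line count that is not a multiple of 4 suggests a truncated file.
    if [ $((lines % 4)) -ne 0 ]; then
        echo "warning: $1 has $lines lines, not a multiple of 4" >&2
    fi
    echo $((lines / 4))
}

# Print both counts and succeed only if the two mates have the same number of records.
check_pair() {
    r1=$(count_fastq_reads "$1")
    r2=$(count_fastq_reads "$2")
    echo "R1=$r1 R2=$r2"
    [ "$r1" -eq "$r2" ]
}
```

With the paths from the command in the question, this could be called as `check_pair ../reads_norrna/all_reads_R1.fastq ../reads_norrna/all_reads_R2.fastq`. Matching counts do not prove the mates are in the same order, only that neither file is shorter than the other.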
My bad, it's 600 million reads, 15-20 million per sample. What I meant is that in this case I ran the command with all samples merged. However, using only one sample I still get errors. I checked the files before and after merging all samples: the number of reads matches between R1 and R2 and is consistent with the FastQC reports I generated during data cleaning. The head of the files looks like this:
Can you run the same command with just one sample and the flag --verbose so we can actually see the error?
With a single sample, the error occurs in the previous step. This is the error I get with the flag --verbose
The initial ERROR is simply a warning, so hopefully that is not the problem. What do you see when you run java --version?
Your error appears to be similar to a prior one reported here: https://github.com/trinityrnaseq/trinityrnaseq/issues/647
Based on the answers there, you are likely running up against a storage quota or some other limit.
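A first look at the limits in effect can be taken from the shell itself. This is a generic sketch, not something Trinity provides: `report_limits` is a hypothetical helper, and a cluster scheduler or filesystem quota may impose further limits that these commands do not show.

```shell
# Print shell resource limits and free disk space for a directory
# (defaults to the current directory).
report_limits() {
    # Per-process virtual memory cap ("unlimited" if none is set).
    echo "virtual memory limit: $(ulimit -v)"
    # Largest file the shell may write, in the shell's block units.
    echo "file size limit: $(ulimit -f)"
    # Free space on the filesystem that holds the directory.
    df -P "${1:-.}"
}
```

Running `report_limits` from inside the Trinity output directory, within the same job environment that runs Trinity, gives the most relevant numbers, since limits in a login shell can differ from those inside a batch job.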
So with 600M reads you will need a minimum of 600 GB of RAM per the Trinity recommendations.
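The rule of thumb quoted in this thread (about 1 GB of RAM per million paired-end reads) can be turned into a quick estimate. The helper name `estimate_trinity_ram_gb` is made up for illustration, and the result is only as good as the rule of thumb itself.

```shell
# Estimate the RAM (in GB) suggested by the rule of thumb of
# 1 GB per million read pairs, rounding up to the next whole GB.
estimate_trinity_ram_gb() {
    echo $(( ($1 + 999999) / 1000000 ))
}
```

For the 639072274 reads per file reported in the log above, `estimate_trinity_ram_gb 639072274` prints 640, well above the 350G passed to `--max_memory`. For the single sample of 15472207 reads it prints 16.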
If you are not able to get more memory, consider normalizing the data. See https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbnorm-guide/
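As a rough sketch of what a BBNorm call for paired reads could look like, the helper below only prints the command so it can be reviewed before running. `bbnorm_cmd` is a hypothetical wrapper, the output file names are placeholders, and `target=100 min=5` are the values from the basic example in the BBNorm guide rather than settings tuned for this data set.

```shell
# Print (do not run) a BBNorm command for a pair of FASTQ files.
# $1 and $2: input R1 and R2; $3: target depth (defaults to 100).
# Output names norm_R1.fastq / norm_R2.fastq are placeholders.
bbnorm_cmd() {
    echo "bbnorm.sh in=$1 in2=$2 out=norm_R1.fastq out2=norm_R2.fastq target=${3:-100} min=5"
}
```

Calling `bbnorm_cmd ../reads_norrna/all_reads_R1.fastq ../reads_norrna/all_reads_R2.fastq` prints the command line. The normalized files would then replace the originals in the `--left` and `--right` arguments to Trinity.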