Hello,
I am a university student trying to use Trimmomatic for the first time. I have put together a command line using the official Trimmomatic manual and some other online resources. I am very new to Bioinformatics and especially new to programming, so I would appreciate it if you could refrain from using any jargon so I can understand your points more easily. Thank you.
I am using Trimmomatic-0.36 and running this command on a remote server, rather than on my local drive. The command line I have most recently used is the following:
java -jar $trimmomatic PE -phred33 /home/will/240_CTTGTA_L004_R1_001.fastq.gz /home/will/C240_CTTGTA_L004_R2_001.fastq.gz C240_CTTGTA_L004_R1.TRIM.fastq.gz C240_CTTGTA_L004_R1.UNTRIM.fastq.gz C240_CTTGTA_L004_R2.TRIM.fastq.gz C240_CTTGTA_L004_R2.UNTRIM.fastq.gz ILLUMINACLIP:TruSeq-PE.fa:2:30:10
I have also tried to use the command line without including a path to my original read files, seen below:
java -jar $trimmomatic PE -phred33 240_CTTGTA_L004_R1_001.fastq.gz C240_CTTGTA_L004_R2_001.fastq.gz C240_CTTGTA_L004_R1.TRIM.fastq.gz C240_CTTGTA_L004_R1.UNTRIM.fastq.gz C240_CTTGTA_L004_R2.TRIM.fastq.gz C240_CTTGTA_L004_R2.UNTRIM.fastq.gz ILLUMINACLIP:TruSeq-PE.fa:2:30:10
Lastly, I have included the full path to the adapters file inside the Trimmomatic-0.36 folder:
java -jar $trimmomatic PE -phred33 240_CTTGTA_L004_R1_001.fastq.gz C240_CTTGTA_L004_R2_001.fastq.gz C240_CTTGTA_L004_R1.TRIM.fastq.gz C240_CTTGTA_L004_R1.UNTRIM.fastq.gz C240_CTTGTA_L004_R2.TRIM.fastq.gz C240_CTTGTA_L004_R2.UNTRIM.fastq.gz ILLUMINACLIP:/usr/local/bin/Trimmomatic-0.36/adapters/TruSeq-PE.fa:2:30:10
For all 3 of the inputs I have used, the terminal has returned this error message:
java.io.FileNotFoundException: /usr/local/bin/Trimmomatic-0.36/adapters/TruSeq-PE.fa (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.usadellab.trimmomatic.fasta.FastaParser.parse(FastaParser.java:54)
at org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer.loadSequences(IlluminaClippingTrimmer.java:110)
at org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer.makeIlluminaClippingTrimmer(IlluminaClippingTrimmer.java:71)
at org.usadellab.trimmomatic.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:32)
at org.usadellab.trimmomatic.Trimmomatic.createTrimmers(Trimmomatic.java:59)
at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:536)
at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:80)
Exception in thread "main" java.io.FileNotFoundException: 240_CTTGTA_L004_R1_001.fastq.gz (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.usadellab.trimmomatic.fastq.FastqParser.parse(FastqParser.java:135)
at org.usadellab.trimmomatic.TrimmomaticPE.process(TrimmomaticPE.java:264)
at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:539
I have looked online for solutions and attempted to fix my command line, but so far nothing has worked. Any help would be greatly appreciated. If any more information is needed, please do not hesitate to ask.
Many thanks,
Will.
The order of the fastq file names is important for trimmomatic. Make sure that is correct.
Does ls -lh /usr/local/bin/Trimmomatic-0.36/ return a result? If the software is there, then perhaps the adapters directory is missing or may not be readable by all accounts, if an admin installed the software. If that is the case then you should ask your admins to change the read permissions.
In the meantime you can also re-download trimmomatic elsewhere and use the path to the adapters file in your command.
Hi, I entered your command and it did return a result, with the adapters folder present. Thank you for the advice. If I download it myself, will it be possible to run Trimmomatic on the command line from my local drive, while accessing the fastq files on the remote server? I always thought that once I was connected to the server I wouldn't be able to access my local drive contents.
If the adapters folder is present, then is the file /usr/local/bin/Trimmomatic-0.36/adapters/TruSeq-PE.fa not there? Based on the error above, either the file is not there or it is not readable.
You can download the adapter file to your $HOME (or any other directory that you have access to) on the server. At run time you will replace /usr/local/bin/Trimmomatic-0.36/adapters/TruSeq-PE.fa with /path_to/adapter.fa (its location on the server) in your command. You can't use data that is present on your local computer from the server.
Hi, I think your initial comment was correct.
I had spotted a few minor typos in my original command line, including a missing letter (C) before stating the first input read 1 file: I typed '240_CTTGTA_L004_R1_001.fastq.gz', instead of 'C240_CTTGTA_L004_R1_001.fastq.gz'.
I also found that my TruSeq adapter name wasn't fully named, as the TruSeq adapters include either TruSeq2 or TruSeq3 adapter files. I only added 'TruSeq-PE.fa', instead of 'TruSeq2/3-PE.fa'.
After correcting all of this, my latest command line is as follows:
This seemed to work and generated what I assume to be the output when Trimmomatic actually functions (see below) - however, I am not entirely sure, as I have actually never seen what a proper Trimmomatic output response looks like.
This showed all parts to work, apart from my read 1 file not being accessed (would this also be true for the read 2 file, as they are in the same directory, even though the log didn't state this?). I assume the only thing I can do now is wait for the admin to change my read permissions, as you'd previously mentioned? If you spot any other issues or mistakes, please let me know.
As always, your help is greatly appreciated.
Many thanks,
Will.
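File-name typos like the missing letter described above can be caught before Trimmomatic starts by checking each path first. The following is only a minimal sketch: the helper name check_readable is made up for illustration, the read file paths are copied from the thread, and TruSeq3-PE.fa is an assumed choice of adapter file (the thread does not say which of the TruSeq2/TruSeq3 files was finally used).

```shell
# Report any path that is missing or unreadable.
# Returns non-zero if at least one path fails the check.
# (check_readable is a hypothetical helper, not part of Trimmomatic.)
check_readable() {
    local status=0
    local f
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "MISSING: $f"
            status=1
        elif [ ! -r "$f" ]; then
            echo "NOT READABLE: $f"
            status=1
        else
            echo "OK: $f"
        fi
    done
    return "$status"
}

# Paths from the thread; the adapter file name is an assumption.
check_readable \
    /home/will/C240_CTTGTA_L004_R1_001.fastq.gz \
    /home/will/C240_CTTGTA_L004_R2_001.fastq.gz \
    /usr/local/bin/Trimmomatic-0.36/adapters/TruSeq3-PE.fa \
    || echo "Fix the paths above before running Trimmomatic."
```

A line starting with MISSING usually points to a typo or a wrong directory, while NOT READABLE points to a permissions problem that an admin would need to fix.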
We made some progress but this is not working yet. Looks like you are reading your data from non-local directories but you are not able to write the output to the directory you are running this from (you may not have write permissions there). Make sure you include the relevant path before the output file name where you are able to write files.
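As a rough illustration of the advice above, the sketch below checks whether a directory is writable and then prints the four output file names with that directory in front. The helper names can_write_here and output_names are hypothetical, and using $HOME as the output directory is only an assumption; any directory on the server where you can create files would do.

```shell
# Succeed only if the directory exists and you may create files in it.
# (Hypothetical helper for illustration.)
can_write_here() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Print the four paired-end output names inside a chosen directory,
# in the order used in the thread's command line.
# Argument 1: output directory; argument 2: sample prefix.
output_names() {
    echo "${1}/${2}_R1.TRIM.fastq.gz"
    echo "${1}/${2}_R1.UNTRIM.fastq.gz"
    echo "${1}/${2}_R2.TRIM.fastq.gz"
    echo "${1}/${2}_R2.UNTRIM.fastq.gz"
}

# Assumption: your home directory is a place where you can write.
outdir="${HOME:-/tmp}"

if can_write_here "$outdir"; then
    output_names "$outdir" C240_CTTGTA_L004
else
    echo "No write permission in $outdir; choose another directory."
fi
```

The printed names are what would replace the bare output file names in the Trimmomatic command.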
Hi, thank you for the reply. I tried your path idea and it worked, with no error messages following. However, my terminal was unresponsive for a long time after I entered the command. Was the software still processing the files? If so, how would you typically know when Trimmomatic has finished properly? I checked the directory after running the command, and all the new output files, alongside a trimlog txt file, were now present.
FYI - I used Control C to escape from the unresponsive terminal, potentially cutting Trimmomatic's processing short. I then went to use FastQC on my new files; however, I got an error message stating that the files I tried to process may be truncated. Again, is this the result of me maybe cutting Trimmomatic short before it ended? No pun intended!
The error log is seen below:
How big is your data? How many reads do you have? trimmomatic is fairly efficient and fast, but in the end it all depends on the amount of data.
What is a "long time"? 10 minutes, 1 hour, 10 hours?
If you interrupt a program, the results it produces will be unreliable.
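One way to see whether a compressed output file was cut short is gzip's built-in test option (gzip -t), which reads the whole file and reports an error if it ends unexpectedly. This is a small sketch only: check_gzip is a made-up helper name, and the two file names are the trimmed outputs from the thread, assumed to be in the current directory.

```shell
# Report whether each .gz file passes gzip's integrity test.
# (check_gzip is a hypothetical helper for illustration.)
check_gzip() {
    local f
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "MISSING: $f"
        elif gzip -t "$f" 2>/dev/null; then
            echo "INTACT: $f"
        else
            echo "DAMAGED OR INCOMPLETE: $f"
        fi
    done
}

# Output names from the thread; add the directory in front if needed.
check_gzip C240_CTTGTA_L004_R1.TRIM.fastq.gz C240_CTTGTA_L004_R2.TRIM.fastq.gz
```

Passing this test only shows that the compressed file itself is complete. Files of this size will also take a while to test, since every byte has to be read.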
The two read files are roughly 23GB each; the only other large output file generated in my last attempt was the trimlog file, which was only 1.5GB. My first attempt ran for ~30 mins, and my second has been running for 2 hours and is still going. I am also using 8 cores with the -threads flag.
It is going to take as long as it takes. You need to be patient and wait until the prompt returns. Depending on the speed of your CPU/disk, things could be slower. It is difficult to provide an estimate.
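To make it obvious when a long command has ended, and whether it ended cleanly, the command can be wrapped so that a message is printed once the prompt is about to return. The sketch below assumes the program signals failure through its exit status, as most command-line tools do. run_and_report is a hypothetical helper, and sleep 1 is only a stand-in so the example runs anywhere; the full java -jar line would go in its place.

```shell
# Run any command, then report how it ended and how long it took.
# (run_and_report is a hypothetical helper for illustration.)
run_and_report() {
    local start end status
    start=$(date +%s)
    if "$@"; then
        status=0
    else
        status=$?
    fi
    end=$(date +%s)
    if [ "$status" -eq 0 ]; then
        echo "FINISHED OK after $((end - start)) seconds"
    else
        echo "FAILED with exit status $status after $((end - start)) seconds"
    fi
    return "$status"
}

# Stand-in command; replace "sleep 1" with the full Trimmomatic command line.
run_and_report sleep 1
```

Pressing Control C before the message appears stops the command early, so the output files should not be trusted in that case.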
That is not good news. It appears that your trimmed data file is incomplete, since you interrupted the running trimmomatic process. You should wait until your system prompt returns, which will ensure that the process was complete. Please re-run the trimmomatic command and wait for it to complete (your system prompt becomes visible again).
I have re-run Trimmomatic for a few hours now, but still no prompt from the terminal. I included '-threads 8' and so expected it to be somewhat faster. I will continue to run it. How long does it usually take (especially with 8 cores)?