Hi all! I have a problem with Trimmomatic 0.33.
I have paired-end Illumina reads. For experimenting, I used just one read pair, but it comes from a real dataset. This is my R1.fastq file:
@NS500448:3:H058VAFXX:1:11101:9071:4665 1:N:0:CGTACTAG+ATAGAGAG
TCTATCAACAGAGAAAGTTACGCAAAGAAAATGAGTCAAAAGTCTCAAAAAAAGAAGAGTGCTAGGACTGTCTCTTATACACACCGAGCCCACGAGACCGT
+
AAAAAFFFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFAFFFFFFFFFFFFFFFFFFFFFFFAFFFFFF7FFFFFF<FFFFFF<FFF<FF.FFFFFFFFFFF
And my R2.fastq file:
@NS500448:3:H058VAFXX:1:11101:9071:4665 2:N:0:CGTACTAG+ATAGAGAG
TCCTAGCGCTCTTCTTTTTTTGAGACTTTTGACTCATTTTCTTTGCGTAACTTTCTCTGTTGATAGACTGTCTCTTATACGCATATGACGCTGCCGACG
+
A.<<.F....A).A..<AAF.F)F.<...<F.......AA...<F7FA<..<.A)..FFF.F.<.F.<.F<<.<<<AA7..7....F.7F<7F<.F..F
This is a pair of reads with a nice overlap; both have a piece of adapter at the 3' end. This can be seen from the alignment of R1 and the reverse complement of R2:
EMBOSS_001   1 --------------------------------TCTATCAACAGAGAAAGT  18
                                               ||||||||||||||||||
EMBOSS_001   1 CGTCGGCAGCGTCATATGCGTATAAGAGACAGTCTATCAACAGAGAAAGT  50

EMBOSS_001  19 TACGCAAAGAAAATGAGTCAAAAGTCTCAAAAAAAGAAGAGTGCTAGGAC  68
               |||||||||||||||||||||||||||||||||||||||||.|||||||
EMBOSS_001  51 TACGCAAAGAAAATGAGTCAAAAGTCTCAAAAAAAGAAGAGCGCTAGGA-  99

EMBOSS_001  69 TGTCTCTTATACACACCGAGCCCACGAGACCGT 101

EMBOSS_001 100 ---------------------------------  99
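The overlap shown in the alignment can also be checked without EMBOSS. The following is a minimal sketch, not anything Trimmomatic itself does: it assumes the insert is shorter than both reads, so that the first L bases of R1 should equal the reverse complement of the first L bases of R2, and it scores each candidate L without gaps. The function names `reverse_complement` and `find_insert_length` and the `min_len` parameter are made up for this illustration.

```python
# Reads copied from the post above.
R1 = "TCTATCAACAGAGAAAGTTACGCAAAGAAAATGAGTCAAAAGTCTCAAAAAAAGAAGAGTGCTAGGACTGTCTCTTATACACACCGAGCCCACGAGACCGT"
R2 = "TCCTAGCGCTCTTCTTTTTTTGAGACTTTTGACTCATTTTCTTTGCGTAACTTTCTCTGTTGATAGACTGTCTCTTATACGCATATGACGCTGCCGACG"

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_insert_length(r1, r2, min_len=20):
    """Guess the insert length of a pair that reads through into adapter.

    For an insert of L bases, r1[:L] should equal the reverse complement
    of r2[:L]. Each candidate L is scored as matches minus mismatches
    (ungapped); the best L and its mismatch count are returned.
    """
    best = None
    for length in range(min_len, min(len(r1), len(r2)) + 1):
        a = r1[:length]
        b = reverse_complement(r2[:length])
        mismatches = sum(x != y for x, y in zip(a, b))
        score = length - 2 * mismatches
        if best is None or score > best[0]:
            best = (score, length, mismatches)
    return best[1], best[2]


insert_len, mismatches = find_insert_length(R1, R2)
print("insert length:", insert_len, "mismatches:", mismatches)
# Everything past the insert should be adapter read-through.
print("R1 tail:", R1[insert_len:])
print("R2 tail:", R2[insert_len:])
```

For this pair it reports a 67-base insert with one mismatch, which agrees with the EMBOSS alignment, and both tails start with CTGTCTCTTATAC.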
This is my adapter file, adapters.fa:
>Prefix_N702+E502/1
CTGTCTCTTATACACATCTCCGAGCCCACGAGACCGTACTAGATCTCGTATGCCGTCTTCTGCTTG
>Prefix_N702+E502/2
CTGTCTCTTATACACATCTGACGCTGCCGACGAATAGAGAGGTGTAGATCTCGGTGGTCGCCGTATCATT
Now when I run Trimmomatic:
java -jar ~/Trimmomatic-0.33/trimmomatic-0.33.jar PE -phred33 R1.fastq R2.fastq out/R1_pair.fastq out/R1_single.fastq out/R2_pair.fastq out/R2_single.fastq ILLUMINACLIP:adapters.fa:2:30:15
it fails to clip the adapters off. The reads end up unchanged in R1_pair.fastq and R2_pair.fastq.
I tried changing pretty much everything I could: the threshold values, and which sequence is /1 and which is /2 in adapters.fa (I even tried reverse complements here). The result is always the same. Interestingly, the only thing that had any effect was using -phred64 instead of -phred33, though I'm sure these files are actually phred33. When I do so, it trims R1 correctly and discards R2. Although this is the best result I have achieved so far, it is of no use: I want to keep both reads from such pairs and just trim them, and my files really are phred33, so when I add any quality-based trimming, Trimmomatic just discards all the reads.
So... does anyone know what I am doing wrong? Thanks.
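The claim that the files are phred33 can be confirmed from the quality strings alone. A small sketch, assuming the usual ASCII conventions (Phred+64 starts at '@', ASCII 64, and the older Solexa+64 scale can reach ';', ASCII 59); the function name `guess_phred_offset` is made up for this example:

```python
def guess_phred_offset(quality_strings):
    """Return 33 if the qualities can only be Phred+33, else None (undecided)."""
    lowest = min(ord(ch) for q in quality_strings for ch in q)
    # No +64 encoding can produce a character below ASCII 59,
    # so anything lower than that must be Phred+33.
    return 33 if lowest < 59 else None


# First 14 characters of the R2 quality line from the post.
r2_quality_prefix = "A.<<.F....A).A"
print(guess_phred_offset([r2_quality_prefix]))
```

The R2 qualities contain ')' (ASCII 41) and '.' (ASCII 46), so the check returns 33. A string made only of high characters stays undecided, which is why the function returns None rather than guessing 64.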
That's cool!
I see. So Trimmomatic does not deal with gaps? But anyway, the R2 adapter aligns well, no gaps, just 2 mismatches:
Or do they both have to align without gaps?
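The difference between the two adapters can be checked with a plain ungapped comparison. This is only an illustration of the gap question and does not reproduce Trimmomatic's scoring. The read tails are the bases following the 67-base overlap in the alignment from the original post, and the helper `ungapped_mismatches` is made up for this sketch.

```python
# Adapter sequences from adapters.fa in the original post.
ADAPTER_1 = "CTGTCTCTTATACACATCTCCGAGCCCACGAGACCGTACTAGATCTCGTATGCCGTCTTCTGCTTG"
ADAPTER_2 = "CTGTCTCTTATACACATCTGACGCTGCCGACGAATAGAGAGGTGTAGATCTCGGTGGTCGCCGTATCATT"

# 3' tails of the reads, i.e. everything after the 67-base overlap.
R1_TAIL = "CTGTCTCTTATACACACCGAGCCCACGAGACCGT"
R2_TAIL = "CTGTCTCTTATACGCATATGACGCTGCCGACG"


def ungapped_mismatches(read_tail, adapter):
    """Count mismatches with the tail laid against the adapter start, no gaps."""
    return sum(a != b for a, b in zip(read_tail, adapter))


print("R2 vs adapter /2:", ungapped_mismatches(R2_TAIL, ADAPTER_2))
print("R1 vs adapter /1:", ungapped_mismatches(R1_TAIL, ADAPTER_1))

# Drop three adapter bases (TCT at positions 16-18) to mimic a gapped alignment.
# Other placements of the 3-base gap in this region are equivalent.
adapter_1_gapped = ADAPTER_1[:16] + ADAPTER_1[19:]
print("R1 vs adapter /1 with a 3-base gap:", ungapped_mismatches(R1_TAIL, adapter_1_gapped))
```

The R2 tail differs from its adapter at 2 positions. The R1 tail matches its adapter for the first 16 bases and then drifts out of register, because the read lacks three bases that the adapter has; once those three bases are skipped, the rest matches exactly.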
There are some parameters that specify the alignment quality. That being said, Trimmomatic's palindromic mode is very confusing, both in its description and in its implementation. I find that I have to reread the paper to understand it, and then I forget it again. The way it gets specified is very counterintuitive as well.
I would not worry about the palindromic thing and would just trim the adapters normally. Also note that the palindromic trim will create single-end reads from the paired-end reads, as it drops one read of the pair, which just adds to the complexity. I would use the simple trimming functionality.
OK, thanks for your advice. It is good to see that I'm not the only one who is confused.