Suggestions for Trimmomatic and "illuminaclip" parameter
5.8 years ago
Batu ▴ 290

I've been using Trimmomatic with the LEADING, TRAILING and SLIDINGWINDOW parameters. Recently we noticed that the ILLUMINACLIP parameter is also widely used. Until then, we had assumed that Trimmomatic was built specifically to trim Illumina adapters, and that it would therefore detect and trim them automatically. But nothing indicates that it works that way.

1) What makes Trimmomatic special for Illumina data? The adapter sequences seem easy to obtain, so another tool could be used to remove Illumina adapters as well.

2) What are your suggestions for using Trimmomatic? Should we always use the ILLUMINACLIP parameter, or only when "Overrepresented Sequences" in the FastQC result shows adapters? If it should be used, what are the default parameter values? We have seen that <adaptername>:2:30:10 is generally used, but we would like to hear your opinions.

RNA-Seq trimming trimmomatic illuminaclip • 17k views
5.8 years ago
caggtaagtat ★ 1.9k

Hi,

I always use Trimmomatic, since it already ships with the adapter sequences of different Illumina systems. Trimmomatic executes the steps in the order they are stated, so I first use the ILLUMINACLIP:TruSeq3-SE:2:30:10 parameter to remove any remaining adapters, then I use the SLIDINGWINDOW:5:20 parameter to remove sequence segments with a low mean quality score.

So I would always use both parameters with Illumina data, since without the ILLUMINACLIP parameter you would not remove any adapters. Afterwards, FastQC should no longer flag any "Overrepresented Sequences" as adapters.
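
To make the ordering and the meaning of the 2:30:10 values concrete, here is a minimal Python sketch. It only assembles and prints a single-end command line, it does not run anything. The jar name and the FASTQ file names are hypothetical placeholders, and the helper functions are made up for illustration. The fields after ILLUMINACLIP are the adapter FASTA file, the seed mismatches, the palindrome clip threshold and the simple clip threshold; the first field is a file path, so it may need to be adjusted to wherever the adapter FASTA file lives on your system.

```python
import shlex


def parse_illuminaclip(step):
    """Split an ILLUMINACLIP step into its named fields.

    Simplification: splits on ':' and so assumes the adapter path
    itself contains no colon.
    """
    name, adapter_file, seed_mm, palindrome_thr, simple_thr = step.split(":")
    if name != "ILLUMINACLIP":
        raise ValueError("not an ILLUMINACLIP step: " + step)
    return {
        "adapter_file": adapter_file,
        "seed_mismatches": int(seed_mm),
        "palindrome_clip_threshold": int(palindrome_thr),
        "simple_clip_threshold": int(simple_thr),
    }


def build_trimmomatic_se_args(jar, fastq_in, fastq_out, steps):
    """Assemble a single-end argument list; steps are applied in list order."""
    return ["java", "-jar", jar, "SE", "-phred33", fastq_in, fastq_out, *steps]


# Adapter clipping first, quality trimming second, as described above.
steps = ["ILLUMINACLIP:TruSeq3-SE:2:30:10", "SLIDINGWINDOW:5:20"]

# Hypothetical file names; replace them with your own paths.
args = build_trimmomatic_se_args(
    "trimmomatic.jar", "reads.fastq.gz", "reads.trimmed.fastq.gz", steps
)

print(parse_illuminaclip(steps[0]))
print(shlex.join(args))
```

Swapping the two entries in `steps` would run the quality trimming before the adapter clipping, which is why the order on the command line matters.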


Thanks, I will start to use ILLUMINACLIP. What about the TRAILING and MINLEN parameters? Is it suitable to use the ILLUMINACLIP, TRAILING and MINLEN parameters for the high-quality FASTQ files only? Since the quality scores are high, we thought these parameters could be used like this: TRAILING:30 MINLEN:50.


OK, so I personally would use the SLIDINGWINDOW:5:20 parameter, since it results in a more precise clipping of bases that were called with low quality. It slides over the sequence beginning at the 5' end and looks at a window of 5 bases. If the average Phred score in the window is lower than 20, it cuts the read at that point.

If you use the TRAILING:30 parameter, Trimmomatic just clips all bases at the read's 3' end with a Phred score lower than 30. That way, the clipping stops as soon as there is a single base with a higher Phred score. The SLIDINGWINDOW parameter would overlook that single base and also take the following 4 bases into account. The Trimmomatic manual also recommends using SLIDINGWINDOW instead of TRAILING.

The MINLEN:50 argument seems fine to me if your read length is around 100-150 nt. If you are looking at splicing, however, it has been shown to be beneficial to use a minimum read length of 75 nt.

I'm not sure if I understood your question correctly, but I would of course process all FASTQ files the same way and only remove samples if, for example, there is a very high amount of rRNA contamination or something like that.
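
The difference between the two steps can be illustrated with a small Python sketch on a made-up list of Phred scores. This is a simplified model for illustration, not Trimmomatic's actual implementation: the function names are hypothetical, and the real tool may differ in exactly where inside a failing window it cuts.

```python
def trailing_keep_length(quals, threshold):
    """Drop bases from the 3' end while their quality is below threshold."""
    keep = len(quals)
    while keep > 0 and quals[keep - 1] < threshold:
        keep -= 1
    return keep


def sliding_window_keep_length(quals, window, threshold):
    """Scan from the 5' end and cut at the start of the first window
    whose mean quality is below threshold (simplified model)."""
    for start in range(len(quals) - window + 1):
        # Compare sums instead of means to avoid floating-point division.
        if sum(quals[start:start + window]) < threshold * window:
            return start
    return len(quals)


# Made-up read: good start, poor tail with one isolated good base (31).
quals = [38, 37, 36, 35, 34, 15, 12, 31, 10, 8]

kept_trailing = quals[:trailing_keep_length(quals, 30)]
kept_window = quals[:sliding_window_keep_length(quals, 5, 20)]

print("TRAILING:30        keeps", kept_trailing)
print("SLIDINGWINDOW:5:20 keeps", kept_window)
```

In this toy example TRAILING:30 stops at the isolated base with score 31 and leaves the low-quality bases 15 and 12 in the read, while the windowed approach cuts before them.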
