LEADING and TRAILING in TRIMMOMATIC
6.5 years ago

Hi Biostars,

I have been using Trimmomatic for quite some time, but I realized there is something I don't understand. One can use the LEADING and TRAILING options to remove bases from the beginning and end of the read, respectively. From the manual: LEADING: Remove low quality bases from the beginning. As long as a base has a value below this threshold the base is removed and the next base will be investigated.

My question is: does "from the beginning" mean from the beginning until the end of the read? If so, what is the point of the TRAILING option if the whole read is scanned? Otherwise, up to which base does Trimmomatic scan with the LEADING option?

Cheers,


@grant.hovhannisyan

A table of quality (Phred) scores is published here:

https://drive5.com/usearch/manual/quality_score.html

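So Q3 is a Phred score of 3 (an error probability of about 50%), which is written as ASCII code 36 (the character "$") in Phred+33 encoded FASTQ files. The 36 is only the encoding; the threshold you pass to Trimmomatic is the Phred score itself, 3.

You can follow the table to set up your custom value.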
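
For readers who want to check the table by hand, here is a minimal Python sketch of the standard Phred relationship, Q = -10 * log10(P), together with the Phred+33 character encoding. The function names are made up for illustration and are not part of Trimmomatic or USEARCH.

```python
# Minimal sketch of Phred score conversions (hypothetical helper names).

def phred_to_error_prob(q: int) -> float:
    """Error probability for a Phred score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def phred_to_char(q: int, offset: int = 33) -> str:
    """FASTQ character for a Phred score (offset 33 is the common encoding)."""
    return chr(q + offset)


def char_to_phred(c: str, offset: int = 33) -> int:
    """Phred score for a single FASTQ quality character."""
    return ord(c) - offset


if __name__ == "__main__":
    for q in (3, 20, 30):
        print(q, phred_to_char(q), round(phred_to_error_prob(q), 4))
```

With these definitions, a score of 3 maps to the character "$" (ASCII 36) and to an error probability of roughly 0.5, which is why a threshold of 3 only removes very poor bases.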


Please stop adding this answer to old threads. It does not add to this or the question you posted this to before. This question here has an accepted answer already so no need to refresh it.

6.5 years ago
BioinfGuru ★ 2.1k

According to the documentation, both options take the argument "quality":

LEADING:quality

leading: Cut bases off the start of a read, if below a threshold quality

quality: Specifies the minimum quality required to keep a base. Remove low quality or N bases.

Also from the manual: LEADING - Remove low quality bases from the beginning. As long as a base has a value below this threshold the base is removed and the next base will be investigated.

"LEADING:3" deletes every base that has a quality below 3 or is an N, beginning at the first base and stopping at the first base that has a quality of at least 3 and is not an N. In the examples below, the digits after each read are its per-base qualities:

AAAGGGTTT 012345678 - LEADING:3 deletes AAA (qualities 0, 1 and 2)

AAANNNTTT 012345678 - LEADING:3 deletes AAANNN (the Ns are removed whatever their quality)

AAAGGGTTT 123456789 - LEADING:3 deletes AA (the third A has quality 3 and is kept)

The same happens with TRAILING, but starting from the other (3') end of the read.
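
The three examples above can be reproduced with a small Python sketch. This is not Trimmomatic's code; `leading_trim` is a hypothetical function that follows the rule as described in this answer, and it assumes that an N is treated as if it had quality 0.

```python
from typing import Sequence


def leading_trim(bases: str, quals: Sequence[int], threshold: int) -> str:
    """Drop bases from the 5' end until one meets the threshold.

    Assumption for this sketch: an N counts as quality 0, so it is
    removed by any threshold above 0, whatever its reported quality.
    """
    for i, (base, qual) in enumerate(zip(bases, quals)):
        effective = 0 if base == "N" else qual
        if effective >= threshold:
            # First acceptable base found: keep it and everything after it.
            return bases[i:]
    # No base met the threshold, so nothing is left of the read.
    return ""


if __name__ == "__main__":
    print(leading_trim("AAAGGGTTT", range(0, 9), 3))   # GGGTTT
    print(leading_trim("AAANNNTTT", range(0, 9), 3))   # TTT
    print(leading_trim("AAAGGGTTT", range(1, 10), 3))  # AGGGTTT
```

Note that the scan stops at the first acceptable base, so a low-quality base further into the read is left alone. That is why LEADING does not make TRAILING redundant.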


Thanks, now it makes sense to me.

6.5 years ago
Expe ▴ 10

I think that with the leading option you only remove a few nucleotides at the beginning of the reads and keep the rest. The trailing option does the opposite: it removes the nucleotides at the end of the read. This site and this site explain why nucleotides should be removed from the beginning and from the end of the reads (you keep the ones in the middle then).

5.4 years ago
drsami ▴ 90

I know my answer is a bit late, but it may help newcomers. The only difference between leading and trailing trimming is the end of the read at which trimming starts:

1 - Leading trimming: starts at the 5' end of the read (from the left). The trimmer removes each base whose quality is below the threshold and stops at the first base that meets it.

2 - Trailing trimming: starts at the 3' end of the read (from the right). The trimmer removes each base whose quality is below the threshold and stops at the first base that meets it. No sliding window is involved here; that is the separate SLIDINGWINDOW step. In both cases the read either keeps its length or gets shorter, never longer.
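
To see both ends handled together, here is a rough Python sketch that applies a leading and a trailing threshold to one read with a Phred+33 quality string. `trim_ends` and `decode` are hypothetical names, not Trimmomatic functions, and the sketch assumes that an N counts as quality 0 and that each end is scanned base by base with no window.

```python
from typing import List, Tuple


def decode(qual_string: str, offset: int = 33) -> List[int]:
    """Turn a FASTQ quality string into integer Phred scores."""
    return [ord(c) - offset for c in qual_string]


def trim_ends(bases: str, qual_string: str,
              leading: int, trailing: int) -> Tuple[str, str]:
    """Trim low-quality bases from the 5' end, then from the 3' end."""
    # Assumption for this sketch: N is scored as 0 regardless of its quality.
    quals = [0 if b == "N" else q for b, q in zip(bases, decode(qual_string))]

    start = 0
    while start < len(quals) and quals[start] < leading:
        start += 1  # walk right from the 5' end

    end = len(quals)
    while end > start and quals[end - 1] < trailing:
        end -= 1  # walk left from the 3' end

    return bases[start:end], qual_string[start:end]


if __name__ == "__main__":
    read, quals = trim_ends("NACGTACGTA", "I#5IIII5#!", leading=3, trailing=3)
    print(read, quals)  # CGTACG 5IIII5
```

In this example two bases are lost at each end, and a poor base in the middle of a read would be left untouched by either scan, so the output can only be as long as the input or shorter.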

I hope that helps.
