Quality trimming with Trimmomatic
6.6 years ago

I am using Trimmomatic for quality trimming of FASTQ files for my project. I'm using the default values for paired-end reads, which trim the sequences at "phred 33". Can anyone help me change the phred value from the default 33 to 20 or any other value, for paired and single reads? The command line I'm using is given below:

java -jar trimmomatic-0.36.jar PE /data/memo/SRR9590_1.fastq \
                    /data/memona/SRR9590_2.fastq SRR959590_A_1P.fq SRR9590_A_1U.fq \
                    SRR9590_A_2P.fq SRR9590_A_2U.fq \
                    ILLUMINACLIP:/data/memona/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:2:30:10 \
                    LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

Phred 33 is actually the quality encoding format of your input file. Trimmomatic cannot recommend trimming parameters for you; it is only a trimmer.

I recommend 123Fastq, which combines FastQC and Trimmomatic in a highly interactive graphical user interface. 123Fastq can suggest trimming parameters based on your QC results. It also adds some improvements to the QC modules of FastQC, a k-mer-based approach to adapter removal during trimming, and many other features. Try it yourself: https://sourceforge.net/projects/project-123ngs/


@blooming.daisy333

The quality (Phred) scores are published here:

https://drive5.com/usearch/manual/quality_score.html

So Q3 means an error probability of about 50% per base; in Phred+33 encoding it is stored as the ASCII character with code 36 (3 + 33).

You can follow the table to set up your custom value.
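
To make the table concrete, here is a minimal shell sketch of the Phred+33 arithmetic. The function names are made up for illustration and are not part of Trimmomatic; it assumes bash and awk are available.

```shell
# Convert a Phred quality score to its Phred+33 ASCII character.
phred_to_char() {
    # Build an octal escape for (score + 33) and let printf %b expand it.
    printf '%b' "\\0$(printf '%03o' "$(( $1 + 33 ))")"
}

# Convert a Phred+33 ASCII character back to its quality score.
char_to_phred() {
    # A leading quote makes printf return the character's ASCII code.
    code=$(printf '%d' "'$1")
    echo "$(( code - 33 ))"
}

# Error probability for a quality score: P = 10^(-Q/10).
phred_error_prob() {
    awk -v q="$1" 'BEGIN { printf "%.4f\n", 10 ^ (-q / 10) }'
}
```

For example, `phred_error_prob 20` prints 0.0100, i.e. a 1% chance that the base call is wrong.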

6.6 years ago

Hello daisy,

Please read the manual again carefully, especially the meaning of each parameter.

The command you posted does not trim at a quality value below 33. It does the following:

Remove adapters (ILLUMINACLIP:TruSeq3-PE.fa:2:30:10)

Remove leading low quality or N bases (below quality 3) (LEADING:3)

Remove trailing low quality or N bases (below quality 3) (TRAILING:3)

Scan the read with a 4-base wide sliding window, cutting when the average quality per base drops below 15 (SLIDINGWINDOW:4:15)

Drop reads shorter than 36 bases (MINLEN:36)
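
To illustrate the sliding-window step, here is a simplified shell sketch. It is an assumption-laden toy, not Trimmomatic's actual code: it cuts the read at the start of the first window whose average quality falls below the threshold, and the real implementation may differ in details. The function name `sliding_window_keep` is hypothetical.

```shell
# Print how many bases of a read survive a simplified sliding-window trim.
# $1: Phred+33 quality string, $2: window size, $3: required average quality
sliding_window_keep() {
    printf '%s\n' "$1" | awk -v win="$2" -v req="$3" '
    BEGIN {
        # Lookup table from printable ASCII character to its code.
        for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
    }
    {
        n = length($0)
        keep = n
        for (start = 1; start + win - 1 <= n; start++) {
            total = 0
            for (j = start; j < start + win; j++)
                total += ord[substr($0, j, 1)] - 33
            # Average below threshold: cut at the start of this window.
            if (total < req * win) { keep = start - 1; break }
        }
        print keep
    }'
}
```

With a window of 4 and a threshold of 15, `sliding_window_keep 'IIIIII####' 4 15` prints 5: the first window whose average drops below 15 starts at base 6, so the first five bases are kept.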

fin swimmer


Dear Fin Swimmer, when I processed the reads, the terminal output mentioned that the phred score is 33 (not phred encoding). I'm unable to figure out from the manual how to adjust the phred value. Any help, please?


Please post the complete output of trimmomatic.

fin swimmer


Dear Swimmer, here is the output of trimmomatic

[memona@farooq Trimmomatic-0.36]$ java -jar trimmomatic-0.36.jar PE /data/memona/SRR959590_1.fastq /data/memona/SRR959590_2.fastq SRR959590_B_1P.fq SRR959590_B_1U.fq SRR959590_B_2P.fp SRR959590_B_2U.fq ILLUMINACLIP:/data/memona/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:3:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
TrimmomaticPE: Started with arguments:
 /data/memona/SRR959590_1.fastq /data/memona/SRR959590_2.fastq SRR959590_B_1P.fq SRR959590_B_1U.fq SRR959590_B_2P.fp SRR959590_B_2U.fq ILLUMINACLIP:/data/memona/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:3:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Quality encoding detected as phred33
Input Read Pairs: 20564050 Both Surviving: 12315568 (59.89%) Forward Only Surviving: 5069479 (24.65%) Reverse Only Surviving: 100165 (0.49%) Dropped: 3078838 (14.97%)
TrimmomaticPE: Completed successfully

I have just noticed that it's the phred encoding, not the phred quality. Can you please guide me on how to include the phred quality values in the command? Thanks.
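
For context on the "Quality encoding detected as phred33" line: the encoding can usually be guessed from the lowest ASCII character in the quality lines. The sketch below is a rough heuristic under stated assumptions (uncompressed FASTQ with exactly four lines per record, cutoff at ASCII 59); it is not Trimmomatic's own detection logic, and `guess_phred_offset` is a hypothetical name.

```shell
# Guess the quality encoding of a FASTQ file from its lowest quality character.
# $1: path to an uncompressed FASTQ file with four lines per record
guess_phred_offset() {
    awk '
    BEGIN {
        # Lookup table from printable ASCII character to its code.
        for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
        min = 127
    }
    NR % 4 == 0 {
        # Every fourth line holds the quality string.
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = ord[substr($0, i, 1)]
            if (c < min) min = c
        }
    }
    END {
        # Characters below ASCII 59 cannot occur in Phred+64 data.
        print (min < 59 ? "phred33" : "phred64")
    }' "$1"
}
```

Either way, the encoding only tells the tool how to read the quality characters; it is not a trimming threshold.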


Please refer to the manual.

LEADING, TRAILING and SLIDINGWINDOW are your friends.
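
As a hedged sketch of what that could look like with a threshold of 20 instead of 3 and 15: the helper names, file names and adapter paths below are placeholders, and whether 20 is a sensible cutoff depends on your data.

```shell
# Build the quality-trimming steps for a given Phred threshold.
trim_steps() {
    printf 'LEADING:%s TRAILING:%s SLIDINGWINDOW:4:%s MINLEN:36' "$1" "$1" "$1"
}

# Paired-end run. Usage: trim_pe THRESHOLD READS_1 READS_2 OUTPUT_PREFIX
trim_pe() {
    # $(trim_steps ...) is left unquoted on purpose so each step
    # becomes its own argument.
    java -jar trimmomatic-0.36.jar PE -phred33 "$2" "$3" \
        "${4}_1P.fq" "${4}_1U.fq" "${4}_2P.fq" "${4}_2U.fq" \
        ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 $(trim_steps "$1")
}

# Single-end run. Usage: trim_se THRESHOLD READS OUTPUT
trim_se() {
    java -jar trimmomatic-0.36.jar SE -phred33 "$2" "$3" \
        ILLUMINACLIP:TruSeq3-SE.fa:2:30:10 $(trim_steps "$1")
}
```

For example, `trim_pe 20 reads_1.fastq reads_2.fastq sample` would run the paired-end command with LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20. The `-phred33` flag only states the input encoding and does not change any threshold.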

Besides this, we could start a discussion about whether quality trimming is really necessary. My opinion is that if your overall quality is fine, then there is no reason to throw away information. And even if it's bad, I would first try error correction with a tool like clumpify or bbmerge from BBTools.

fin swimmer


Hi finswimmer,

I agree with what you write, but please take into account how you write it, and how it may get interpreted by other users, especially those who are not native English speakers. Your post was edited to remove a part which may come across as condescending. We like to keep biostars a friendly place and are happy with your contributions!

Cheers,
Wouter


Hi WouterDeCoster,

I'm sorry. It wasn't meant to be condescending in any way (I had to look up in the dictionary what this means ;)). It was meant more as a prompt to be a little more proactive, as I had already pointed to the manual (which is good to read) and pasted the most important part.

fin swimmer

6.6 years ago
chen ★ 2.5k

I suggest you try another quality profiling and trimming tool, fastp, which is easier to use and is 3x faster than Trimmomatic.


Thanks for the recommendation, but your suggestion could use a disclaimer.


I'd like to clarify this a little further and tell you about how I deal with representing a (free) resource on here.

The first thing you'll see is my name "Emily Ensembl" – I've made it completely obvious that I work for Ensembl, this is my disclaimer. Everybody knows that when I recommend Ensembl for a task, it's because I work for them. I can see that you've put your resource as your location, which is almost as good, as it shows up on posts and answers (like it's done here), but it won't show up if you make any comments, so I think it is better to put your resource in your name rather than your location.

Secondly, I never recommend Ensembl when someone has already started using something else. If someone says "I want help with this UCSC thing" I will never say "You should use Ensembl instead". But if they say "I want to do this and don't know what tool to use", I think it's perfectly fine to suggest Ensembl. The only exception would be if the tool they were trying to use was completely unsuitable for the task.

Lastly, if I do recommend Ensembl, I make sure I name Ensembl. In your post you just provide a link to the tool, and it's only clear that that's the tool you work for once you've clicked on the link. If you had said "I suggest you try another quality profiling and trimming tool, fastp from OpenGene", along with your location at OpenGene, it would be clear why you're recommending it and nobody would mind.

Nobody here minds people promoting their tools (especially if they're good tools and free) but we like people to be open that that's what they're doing.


Thank you, Emily. Your suggestion is good.
