I recently ran Trimmomatic PE with the following thresholds:
java -jar trimmomatic-0.36.jar PE f1 f2 f1_paired f1_unpaired f2_paired f2_unpaired ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
I didn’t specify -phred33 or -phred64, because Trimmomatic printed this message:
Quality encoding detected as phred33
So I assume Trimmomatic (v0.36) applies -phred33 automatically, since it detected phred33 in our FASTQ files.
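For context, here is a minimal sketch of how a quality encoding can be guessed from the quality strings alone. This is a common heuristic, not Trimmomatic's actual detection code, and the function name guess_phred_offset and the cutoff of 59 are my own assumptions: Phred+64 data normally starts at '@' (ASCII 64) and old Solexa+64 data at ';' (ASCII 59), so any lower character suggests Phred+33.

```python
def guess_phred_offset(quality_strings):
    """Guess the quality offset (33 or 64) from the lowest ASCII code seen.

    Hypothetical helper for illustration only; not taken from Trimmomatic.
    """
    lowest = min(ord(ch) for q in quality_strings for ch in q)
    # Characters below ';' (ASCII 59) should not occur in +64 encodings.
    return 33 if lowest < 59 else 64


# Example quality lines (made up for illustration, not from my data).
print(guess_phred_offset(["IIIIHHGG#5", "FFFF:::,,,"]))
```

A file that contains only high-quality bases can be ambiguous under this heuristic, so it is only a rough check.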
I got the following survival rates:
Input Read Pairs: 66780154 Both Surviving: 62036296 (92.90%) Forward Only Surviving: 2387288 (3.57%) Reverse Only Surviving: 826849 (1.24%) Dropped: 1529721 (2.29%)
TrimmomaticPE: Completed successfully
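As a sanity check, the four categories should add up to the number of input read pairs, and the percentages can be recomputed from the counts. A small sketch using the numbers reported above (the variable names are my own):

```python
# Counts copied from the Trimmomatic summary line.
total_pairs = 66780154
counts = {
    "both_surviving": 62036296,
    "forward_only": 2387288,
    "reverse_only": 826849,
    "dropped": 1529721,
}

# Every input pair must fall into exactly one category.
assert sum(counts.values()) == total_pairs

# Recompute each category as a percentage of the input pairs.
percentages = {name: 100 * n / total_pairs for name, n in counts.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")
```

The recomputed values match the reported 92.90%, 3.57%, 1.24%, and 2.29%.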
When I then ran FastQC on the paired output FASTQ files (f1_paired and f2_paired), I got a red cross (fail) on “Per base sequence content”, “Sequence Duplication Levels”, and “Kmer Content”.
How should I adjust the Trimmomatic thresholds to improve the FastQC reports?
I am considering LEADING:5 and TRAILING:5 (or LEADING:10 and TRAILING:10; would that be too high for phred33?). I am not sure how much changing LEADING and TRAILING would improve the quality, though. Should I also raise the SLIDINGWINDOW quality threshold from 15 to 17 (or even 20)?
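To put the candidate thresholds in perspective, a Phred score Q corresponds to a base-call error probability of 10^(-Q/10). As far as I understand, the LEADING, TRAILING, and SLIDINGWINDOW values are Phred scores, so the phred33 offset only affects how scores are written in the FASTQ file, not what a threshold means. A minimal sketch (the function name phred_to_error_prob is my own):

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


# Current and candidate thresholds from the question above.
for q in (3, 5, 10, 15, 17, 20):
    # chr(q + 33) is the character that encodes this score in a phred33 file.
    print(f"Q{q} ('{chr(q + 33)}'): error probability {phred_to_error_prob(q):.3f}")
```

For example, Q10 means a 1 in 10 chance that the base call is wrong, and Q20 means 1 in 100.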
Any suggestions and advice would be greatly appreciated.
Thank you very much!