Question

What is a good Trimmomatic parameter?

1


6.5 years ago

Arindam Ghosh ▴ 530

I am trying to clean RNA-seq reads with Trimmomatic using the following parameters:

java -jar trimmomatic-0.36.jar PE -trimlog file_trim_log input_1.fastq.gz input_2.fastq.gz output_1P_clean.fq.gz output_1U_clean.fq.gz output_2P_clean.fq.gz output_2U_clean.fq.gz ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10 SLIDINGWINDOW:10:30 LEADING:28 TRAILING:28 MINLEN:75

ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10 : My reads contain TruSeq Universal adapters, as indicated by FastQC.

SLIDINGWINDOW:10:30 : As per the manual

LEADING:28 : I want to remove leading bases whose quality falls below 28 in the FastQC per-base quality module

TRAILING:28 : I want to remove trailing bases whose quality falls below 28 in the FastQC per-base quality module

MINLEN:75 : Minimum length should not fall below 75bp
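To make the effect of these settings concrete, here is a minimal Python sketch of the quality-based steps, applied in the same order as in the command above. It is a simplified model of the behaviour described in the Trimmomatic manual, not the tool's actual code: the exact cut position inside a failing window and the handling of reads shorter than the window may differ. The function names and the example read are made up for illustration, and ILLUMINACLIP is not modelled.

```python
def sliding_window(quals, window, required):
    """Cut the read at the start of the first window whose mean quality
    is below `required` (simplified; the real tool may cut elsewhere
    inside that window)."""
    for start in range(len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < required:
            return quals[:start]
    return quals


def leading(quals, threshold):
    """Drop bases from the 5' end while their quality is below `threshold`."""
    i = 0
    while i < len(quals) and quals[i] < threshold:
        i += 1
    return quals[i:]


def trailing(quals, threshold):
    """Drop bases from the 3' end while their quality is below `threshold`."""
    end = len(quals)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return quals[:end]


def trim(quals, window=10, window_q=30, lead_q=28, trail_q=28, minlen=75):
    """Apply the steps in command-line order; return None if the read
    ends up shorter than `minlen` (i.e. it would be dropped)."""
    quals = sliding_window(quals, window, window_q)
    quals = leading(quals, lead_q)
    quals = trailing(quals, trail_q)
    return quals if len(quals) >= minlen else None


# Hypothetical 100-base read: Q36 for 70 bases, then Q20 for the last 30.
read = [36] * 70 + [20] * 30

# Settings from the command above.
strict = trim(read)
# Lenient settings in the style of the manual's example
# (LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36).
lenient = trim(read, window=4, window_q=15, lead_q=3, trail_q=3, minlen=36)

print("strict:", None if strict is None else len(strict))
print("lenient:", None if lenient is None else len(lenient))
```

In this toy case the strict settings cut the read to 64 bases, which falls under MINLEN:75, so the read is dropped, while the lenient settings keep all 100 bases. Real reads will behave differently, but the sketch shows how a high window threshold combined with a high MINLEN can move reads into the unpaired or dropped output.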

Are the parameters reasonable or too strict?

Several manuals give a quality score of 3 for LEADING and TRAILING. Isn't that too low?

I am a bit concerned because only 70-80% of the paired reads survived. The rest fall into either Forward/Reverse Only Surviving or Dropped.

trimmomatic RNA-Seq ngs sequencing • 10k views

updated 5.2 years ago by MiladAD ▴ 10 • written 6.5 years ago by Arindam Ghosh ▴ 530

1


In RNA-seq data analysis, you have to strike a balance between acceptable quality and minimizing the number of discarded reads. Note that all adapter contamination should be trimmed. I recommend 123Fastq, which combines FastQC and Trimmomatic in a highly interactive graphical user interface. It also adds some improvements to the FastQC QC modules, a k-mer-based approach to adapter removal during trimming, and many other features. Try it yourself: https://sourceforge.net/projects/project-123ngs/

5.2 years ago by MiladAD ▴ 10

score 5 · Answer 1 · 2018-05-16

This is totally up to you! It's more of a biological question.

AFAIK, trimming reads in RNA-seq is not a mandatory step anymore. People say that if your reads are not good enough to be aligned, they simply will not align (assuming you run an alignment afterwards, of course).

But if you really want to trim these reads, you will have to check the quality at the 5' and 3' ends of your data using a tool like FastQC.

If you have really low quality on, say, the first 5 bases, you can trim them. But if the quality of these bases is good, why trim them?
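FastQC already reports this, but if you want the numbers directly, a rough Python sketch of the same per-position check could look like the following. It assumes standard 4-line FASTQ records and Phred+33 encoding; the function name and the tiny example records are made up for illustration.

```python
def mean_quality_per_position(fastq_lines, phred_offset=33):
    """Return the mean Phred quality at each read position.

    Assumes 4-line FASTQ records (the quality string is every 4th line)
    and Phred+33 encoding; both are assumptions, check your data.
    """
    totals, counts = [], []
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:  # only the quality line of each record
            continue
        for pos, ch in enumerate(line.rstrip("\n")):
            if pos == len(totals):  # first read that reaches this position
                totals.append(0)
                counts.append(0)
            totals[pos] += ord(ch) - phred_offset
            counts[pos] += 1
    return [t / c for t, c in zip(totals, counts)]


# Two made-up records: '!' is Q0, '#' is Q2, 'I' is Q40 in Phred+33.
example = [
    "@r1", "ACGT", "+", "!!II",
    "@r2", "ACG", "+", "##I",
]
per_position = mean_quality_per_position(example)
print(per_position[:5])  # mean quality of up to the first 5 positions
```

For a real file you could pass an open handle instead of a list, for example `gzip.open("input_1.fastq.gz", "rt")`, and compare the first and last few values against the thresholds you are considering for LEADING and TRAILING.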

Here is a relevant post: Trimming single end reads for STAR?

And the publication: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4766705/