Question

Interpreting trim_galore report

7.3 years ago

vaslanzadeh ▴ 20

Hello, I have single-end small RNA-seq data, and I used Trim Galore to filter out adaptors and trim low-quality bases. I am new to RNA-seq data analysis and have a question about the trim_galore report. I understand most of the report except "Total written (filtered):". What does it mean? Is it number of the bases with good quality which passed the filtering (18.6%)? If so, does it mean 82.4% of the bases had low quality and were trimmed off!? In the end, trim_galore removes 75.7% of the reads. Thanks

This is cutadapt 1.13+1.g6b2366d with Python 2.7.10
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a TGGAATTCTCGG BacRNA_sequence.fq.gz
Trimming 1 adapter with at most 10.0% errors in single-end mode ...


Total reads processed:               3,876,983

Reads with adapters:                 3,668,488 (94.6%)

Reads written (passing filters):     3,876,983 (100.0%)

Total basepairs processed: 139,571,388 bp

Quality-trimmed:                      2,391,351 bp (1.7%)

Total written (filtered):           26,018,981 bp (18.6%)

. . .

3876983 sequences processed in total

Sequences removed because they became shorter than the length cutoff of 16 bp: 2933098 (75.7%)

RNA-Seq • 4.1k views

updated 3.7 years ago by GenoMax 147k • written 7.3 years ago by vaslanzadeh ▴ 20

Is it number of the bases with good quality which passed the filtering (18.6%)? If so, does it mean 82.4% of the bases had low quality and were trimmed off!?

Yes, you are correct in your interpretation.
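
For anyone reading the same report, the figures it prints can be broken down a little further. The sketch below only redoes the arithmetic on the numbers posted in the question; the variable names are made up for illustration. It assumes that, because cutadapt wrote 100.0% of the reads, every base that was neither written nor counted as quality-trimmed was removed together with an adapter match (with -a, the adapter and everything after it is cut, and -O 1 allows matches as short as one base).

```python
# Figures copied from the cutadapt / Trim Galore report in the question.
total_reads = 3_876_983
total_bp = 139_571_388
quality_trimmed_bp = 2_391_351
written_bp = 26_018_981
too_short_reads = 2_933_098


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: cutadapt discarded no reads (100.0% written), so bases that were
# neither written nor quality-trimmed were removed by adapter trimming.
adapter_trimmed_bp = total_bp - quality_trimmed_bp - written_bp

# Mean read length before and after trimming (all reads are still present
# at this stage; the 16 bp length cutoff is applied afterwards).
mean_len_before = total_bp / total_reads
mean_len_after = written_bp / total_reads

print("bases written:          ", pct(written_bp, total_bp), "%")
print("bases quality-trimmed:  ", pct(quality_trimmed_bp, total_bp), "%")
print("bases adapter-trimmed:  ", pct(adapter_trimmed_bp, total_bp), "%")
print("mean length before/after:", mean_len_before, round(mean_len_after, 1))
print("reads below 16 bp cutoff:", pct(too_short_reads, total_reads), "%")
```

Under that assumption, quality trimming accounts for only about 1.7% of the bases, while roughly 79.6% go with the adapter. The mean read length drops from 36 bp to about 6.7 bp, which is consistent with 75.7% of the reads then falling below the 16 bp cutoff.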

7.3 years ago by h.mon 35k

In this case, can we simply calculate the sequencing depth as:

Total written (filtered): 26,018,981 bp * 2 / genome size?
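
The proposed formula can be written out as a small function. This is only a sketch: `mean_depth` and the `genome_size` value are hypothetical placeholders, not anything taken from the report. The usual rule of thumb for mean depth is total bases divided by genome size; the factor of 2 from the formula above is kept as an optional `multiplier` because the thread does not say where it comes from. Note also that the 26,018,981 bp figure is counted before Trim Galore drops reads shorter than 16 bp, so the bases in the surviving reads will be fewer than this.

```python
def mean_depth(total_bases, genome_size, multiplier=1):
    """Mean sequencing depth as multiplier * total_bases / genome_size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return multiplier * total_bases / genome_size


written_bp = 26_018_981   # "Total written (filtered)" from the report
genome_size = 5_000_000   # hypothetical genome size in bp, for illustration only

# Plain bases / genome size.
print(mean_depth(written_bp, genome_size))

# The formula as proposed above, with the extra factor of 2.
print(mean_depth(written_bp, genome_size, multiplier=2))
```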

3.7 years ago by Bio22 • 0