Hello, I have single-end small RNA-seq data, and I used Trim Galore to remove adapters and trim low-quality bases. I am new to RNA-seq data analysis and have a question about the Trim Galore report. I understand most of the report except "Total written (filtered):". What does it mean? Is it the number of good-quality bases that passed the filtering (18.6%)? If so, does it mean that 81.4% of the bases had low quality and were trimmed off? At the end, Trim Galore removes 75.7% of the reads. Thanks
This is cutadapt 1.13+1.g6b2366d with Python 2.7.10
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a TGGAATTCTCGG BacRNA_sequence.fq.gz
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Total reads processed: 3,876,983
Reads with adapters: 3,668,488 (94.6%)
Reads written (passing filters): 3,876,983 (100.0%)
Total basepairs processed: 139,571,388 bp
Quality-trimmed: 2,391,351 bp (1.7%)
Total written (filtered): 26,018,981 bp (18.6%)
. . .
3876983 sequences processed in total
Sequences removed because they became shorter than the length cutoff of 16 bp: 2933098 (75.7%)
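Yes, your reading of "Total written (filtered)" is correct: it is the number of bases left after trimming (18.6% of the input). Note, though, that the report attributes only 1.7% of the bases to quality trimming, so most of the removed bases were trimmed as adapter sequence rather than for low quality.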
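To make the numbers easier to follow, here is a minimal sketch that re-derives the percentages from the figures pasted in the report above. The variable names and the `pct` helper are my own and are not part of cutadapt or Trim Galore. Treating everything that was removed but not quality-trimmed as adapter trimming is an assumption based on the command line shown (only `-q` and `-a` remove bases here).

```python
# Figures copied from the report in the question.
total_bp = 139_571_388           # Total basepairs processed
quality_trimmed_bp = 2_391_351   # Quality-trimmed
written_bp = 26_018_981          # Total written (filtered)
total_reads = 3_876_983          # Total reads processed
too_short_reads = 2_933_098      # Removed by the 16 bp length cutoff


def pct(part, whole):
    """Return part as a percentage of whole, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Bases that were removed for any reason.
removed_bp = total_bp - written_bp

# Assumption: whatever was not quality-trimmed was removed as adapter sequence.
other_removed_bp = removed_bp - quality_trimmed_bp

# Reads that survive the length cutoff.
kept_reads = total_reads - too_short_reads

print("written bases       :", pct(written_bp, total_bp), "%")
print("removed bases       :", pct(removed_bp, total_bp), "%")
print("  quality-trimmed   :", pct(quality_trimmed_bp, total_bp), "%")
print("  other (adapter)   :", pct(other_removed_bp, total_bp), "%")
print("reads below cutoff  :", pct(too_short_reads, total_reads), "%")
print("reads kept          :", kept_reads)
```

With these inputs the removed fraction comes out at 81.4% of the bases, of which only 1.7 percentage points are quality trimming.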
In this case, can we simply calculate the sequencing depth as:
Total written (filtered): 26,018,981 bp * 2 / genome size?
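As a rough illustration, the formula proposed above can be sketched as follows. The genome size used here is a hypothetical placeholder, not a value from this thread, and the factor of 2 is copied from the question as-is and exposed as a parameter rather than endorsed. One caveat that the report itself suggests: "Total written (filtered)" is reported while 100.0% of the reads are still present, before the 16 bp length cutoff removes 75.7% of them, so it may overstate the bases that end up in the final trimmed file.

```python
def estimated_depth(total_bases, genome_size_bp, factor=1.0):
    """Average depth estimate: factor * total bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return factor * total_bases / genome_size_bp


written_bp = 26_018_981      # "Total written (filtered)" from the report
genome_size_bp = 5_000_000   # hypothetical placeholder, replace with the real size

# factor=2 reproduces the formula exactly as written in the question.
depth = estimated_depth(written_bp, genome_size_bp, factor=2)
print(f"estimated depth: {depth:.2f}x")
```

A more conservative variant would pass the base count of the reads that actually survive the length cutoff, counted from the trimmed FASTQ file, instead of `written_bp`.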