3.8 years ago
cg1440
What explains the fact that the sequence quality is almost entirely good across all positions, but the per-tile quality is mostly bad? And what does this say about the base quality of the sample?
Note that the above images come from a FastQC report of a quality- and adapter-trimmed SARS-CoV-2 FASTQ file.
Are you sure the trimming worked well? In any case, if you are going to align to a reference, the aligner should take care of bad-quality data.
BBMap suite has a tool called filterbytile.sh that you can try to see if it makes a difference with improving per-tile quality. See: Introducing FilterByTile: Remove Low-Quality Reads Without Adding Bias
I think so. I used Trimmomatic (v0.36) and compared the MultiQC reports of the samples before and after trimming. The samples that did not pass the sequence quality check before trimming passed it afterwards, so I guess Trimmomatic managed to remove the bad-quality bases. But I have no clue why the tile quality is bad given that the sequence quality is good (this is the case with multiple samples).
Also, is it better to apply filterbytile.sh on the trimmed files or on the original ones?
Can you make the FastQC plot with compression of cycles turned off? Perhaps you have some hidden cycles that have a problem with Q scores.
I would recommend trying bbduk.sh (GUIDE) or fastp as well. You may have residual adapter bases left even after trimming. bbduk trims by read overlap.
Depends on what your ultimate aim is, but this data should be alignable as is.
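On the original question of how the sequence quality can look fine while the per-tile plot is flagged: the per-tile view is relative, not absolute. As far as I understand the FastQC documentation, it compares each tile's mean quality at a position with the mean over all tiles, and warns or fails when a tile is more than 2 or 5 Phred units below that. A tile can therefore be flagged even when its absolute quality is still acceptable. Below is a minimal Python sketch of that idea, not FastQC's actual implementation. It assumes Illumina CASAVA 1.8+ style headers (instrument:run:flowcell:lane:tile:x:y) and Phred+33 qualities; the function names and the toy reads are made up for illustration.

```python
from collections import defaultdict


def tile_of(header):
    """Return the tile field of a read header.

    Assumed layout: @instrument:run:flowcell:lane:tile:x:y [comment]
    """
    return header.split()[0].split(":")[4]


def per_tile_deviation(records, offset=33):
    """Compute per-tile quality deviation.

    records: iterable of (header, quality_string) pairs.
    Returns {tile: {position: tile mean - mean of all tile means}}.
    """
    sums = defaultdict(lambda: defaultdict(int))
    counts = defaultdict(lambda: defaultdict(int))
    for header, qual in records:
        tile = tile_of(header)
        for pos, ch in enumerate(qual):
            sums[tile][pos] += ord(ch) - offset  # Phred+33 decoding
            counts[tile][pos] += 1

    # Mean quality per tile and position.
    means = {
        t: {p: sums[t][p] / counts[t][p] for p in sums[t]} for t in sums
    }

    deviation = {t: {} for t in means}
    positions = sorted({p for t in means for p in means[t]})
    for p in positions:
        vals = [means[t][p] for t in means if p in means[t]]
        avg = sum(vals) / len(vals)  # mean over tiles at this position
        for t in means:
            if p in means[t]:
                deviation[t][p] = means[t][p] - avg
    return deviation


# Toy data: three tiles at Q40 ('I') and one tile at Q20 ('5').
toy_reads = [
    ("@M1:1:FC:1:1101:100:200 1:N:0:1", "IIII"),
    ("@M1:1:FC:1:1102:100:200 1:N:0:1", "IIII"),
    ("@M1:1:FC:1:1103:100:200 1:N:0:1", "IIII"),
    ("@M1:1:FC:1:1104:100:200 1:N:0:1", "5555"),
]
toy_deviation = per_tile_deviation(toy_reads)
```

In the toy data the mean over tiles is Q35 at every position, which would still look good in a per-base plot, yet tile 1104 sits 15 Phred units below it and would stand out in a per-tile view.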