Hey,
I'm working with whole-genome sequencing data from a non-model organism. I ran FastQC for quality control and found failures in both adapter content and per-tile sequence quality. Adapter content is no longer an issue, but I'm struggling with per-tile sequence quality. I tried filtering with fastp and, independently, with the filterbytile.sh script from BBMap, but neither resolved the failure. I attach figures for one raw paired-end genome (with filterbytile.sh some of the red tiles disappear, but the module still fails):
Forward:
Reverse: picture 2
I know there are not many red tiles, but I wanted to check whether I should try to filter them or leave the data as it is.
Thanks!
Please see How to add images to a Biostars post to add your images properly. You need the direct link to the image or the HTML embed code, not the link to the webpage that has the image embedded (which is what you have used here)
Is it okay now? Thanks for the message!
It is normal to see a few tiles light up like this. If there is consistently bad data, it would be handled by your aligner, or by Q-score based pre-filtering if you were doing any de novo work.
Thanks, did you also check this one? It is not updating properly in the first message:
It is up to you. Are you going to align to a good reference? If yes, I would say go ahead. If you are planning to do de novo work then you could try to eliminate these with filterbytile.
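If you want to see which tiles are affected before deciding, here is a minimal Python sketch of the idea. It assumes Casava 1.8+ style Illumina headers (`@instrument:run:flowcell:lane:tile:x:y ...`), Phred+33 qualities, and plain 4-line FASTQ records. The function names, the `max_drop` threshold, and the example reads are made up for illustration. It is also coarser than FastQC's per-tile module, which looks at each read position separately, so treat it as a rough check and not a reimplementation of FastQC or filterbytile.sh.

```python
from collections import defaultdict
from statistics import mean


def parse_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines.

    Assumes strict 4-line records (no wrapped sequences).
    """
    it = iter(lines)
    for header in it:
        seq = next(it)
        next(it)  # skip the '+' separator line
        qual = next(it)
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def tile_of(header):
    """Return the tile number, taken as the 5th colon-separated field.

    Assumption: Casava 1.8+ header layout; older headers will not parse.
    """
    return int(header.split(" ")[0].split(":")[4])


def mean_quality_per_tile(records, offset=33):
    """Average the Phred score of all bases, grouped by tile."""
    sums = defaultdict(int)
    counts = defaultdict(int)
    for header, _seq, qual in records:
        tile = tile_of(header)
        sums[tile] += sum(ord(c) - offset for c in qual)
        counts[tile] += len(qual)
    return {tile: sums[tile] / counts[tile] for tile in sums}


def flag_low_tiles(tile_means, max_drop=2.0):
    """Return tiles whose mean quality is more than max_drop below the
    average of all tile means. The default threshold is arbitrary."""
    overall = mean(tile_means.values())
    return sorted(t for t, q in tile_means.items() if overall - q > max_drop)


# Hypothetical reads: two good tiles (Q40) and one poor tile (Q2).
example = [
    "@SIM:1:FCX:1:1101:1000:2000 1:N:0:ACGT", "ACGT", "+", "IIII",
    "@SIM:1:FCX:1:1101:1001:2001 1:N:0:ACGT", "ACGT", "+", "IIII",
    "@SIM:1:FCX:1:1102:1000:2000 1:N:0:ACGT", "ACGT", "+", "IIII",
    "@SIM:1:FCX:1:2101:1000:2000 1:N:0:ACGT", "ACGT", "+", "####",
]

tile_means = mean_quality_per_tile(parse_fastq(example))
print(tile_means)
print(flag_low_tiles(tile_means))
```

For a real file you would pass an open file handle (or `gzip.open(path, "rt")` for compressed reads) to `parse_fastq` instead of the example list. If only a handful of tiles are flagged and you are aligning to a reference, that supports leaving the data as it is.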
I tried to filter them with filterbytile but it did not improve the results... Luckily I have the reference genome, thanks for your help! :)