NEW beginner
I have run some samples on a MiSeq. After a quality check with FastQC, some of my samples failed the "per tile sequence quality" module at a specific position. If I am not mistaken, this was caused by the flowcell preparation, because all the affected samples show the error at the same position.
My questions are...
- What is the cause of this error?
- How can I solve this problem or improve the quality of my sequences?
- Can I ignore this error?
- I have heard that FilterByTile might help with this, but I do not know how to use it. Please give me some suggestions.
Thank you in advance.
I would not worry too much in this case since you had poor quality only in one tile, while the rest of the flowcell looks good. This should not affect downstream analysis. However, it would be good to have a look also at the plot "per base sequence quality". Can you upload it?
Sure! Here it is.
Marco Pannone, could you please tell me how to interpret these two plots? Thank you.
You attached the plot under the section "per sequence quality scores", while I asked for the plot in "per base sequence quality".
I'm very sorry. You can now find it below.
Your base quality is not very good towards the 3'-end of your reads, which is very common in sequencing, especially with quite long reads (as in your case). I would suggest you perform some trimming using a Phred score cutoff (<30 should be appropriate) in order to remove the low-quality bases from your reads. You can use any popular trimming tool, such as trimgalore (https://github.com/FelixKrueger/TrimGalore/blob/master/Docs/Trim_Galore_User_Guide.md#step-1-quality-trimming) or cutadapt (https://cutadapt.readthedocs.io/en/stable/guide.html). Look through the documentation and you should be able to put together the appropriate command line by yourself. Does FastQC report issues in other fields too (overrepresented sequences, adapter content, etc.)?
Thank you for your suggestion.
Actually, the plots above are from reads already trimmed at Q20 (99% base-call accuracy). After trimming, only two issues remain. The other one is the sequence length distribution (you can see it below).
In addition, there were many issues reported by FastQC before trimming.
Here is the report after trimming at Q30, as you recommended. The same problems are still present.
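For reference, the Q20 and Q30 cutoffs discussed in this thread can be illustrated with a small Python sketch. It is not taken from any of the tools mentioned above: the function names `phred_to_error_prob` and `quality_trim_3prime` and the example quality values are made up for illustration. The trimming function follows the running-sum idea used for 3'-end quality trimming in tools such as cutadapt, but it is a simplified stand-in and not their actual code. It also hints at why a sequence length distribution warning is expected after trimming: reads end up with different lengths, and a stricter cutoff shortens them more.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def quality_trim_3prime(qualities, cutoff):
    """Return how many bases to keep after trimming the 3' end.

    Simplified running-sum approach (an assumption, not a tool's real code):
    walk from the 3' end, accumulate (quality - cutoff), and cut at the
    position where that running sum is lowest.
    """
    running = 0
    lowest = 0
    keep = len(qualities)
    for i in range(len(qualities) - 1, -1, -1):
        running += qualities[i] - cutoff
        if running > 0:
            # Quality has recovered; stop scanning towards the 5' end.
            break
        if running < lowest:
            lowest = running
            keep = i
    return keep


# Hypothetical per-base qualities for one read, dropping towards the 3' end.
example_quals = [40, 40, 40, 30, 20, 10, 2]

for cutoff in (20, 30):
    kept = quality_trim_3prime(example_quals, cutoff)
    accuracy = 1 - phred_to_error_prob(cutoff)
    print(f"Q{cutoff}: accuracy {accuracy:.3%}, bases kept {kept}")
```

With these made-up values, the Q30 cutoff keeps fewer bases than the Q20 cutoff, so a stricter threshold trades read length for per-base accuracy.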