Hi,
I have been looking at a library run on a HiSeq using FastQC. Usually I get a good "Per tile sequence quality" plot, with the entire area looking bluish (OK quality throughout). But this time I get tiles that appear to give consistently low-quality reads over the entire read length. The thing is - I thought this was due to a problem in the flow cell, but I see the same flow cell positions appearing to have bad reads over several different runs. The strangest bit - the same experiment was run on two different occasions; it was too large to sequence in one Illumina run, so it was divided into two groups. Group 1 looks bad (the same flow cell positions give low-quality reads at the exact same places), but in group 2 the libraries from the same experiment look bad while the quality of the rest of the libraries looks great...
I checked other runs - some looked great, others had the same problem... so whatever it is, it is not happening consistently in every run.
I found a webpage describing this - please look at the second figure from the top, I see something like that, just worse: https://sequencing.qcfail.com/articles/position-specific-failures-of-flowcells/
Does anyone know what this means?
If you are aligning to a reference I would not worry about the per tile quality. It is possible that you are sequencing a low nucleotide complexity region (does this run have phiX spiked-in?) and that may be resulting in the bad tile quality.
If you are still interested in addressing this then BBMap has a tool called filterbytile.sh that you can use: Introducing FilterByTile: Remove Low-Quality Reads Without Adding Bias
There is phiX, but it can't be the reason as all the runs have phiX... Also, the quality is clearly affected, as the error bars of the reads are very, very long.
You could add some images using these directions: How to add images to a Biostars post so we can get visual info.
Hopefully your sequence provider does due diligence. If there was a problem during the run then they should not have released the data. Have you tried aligning the data as is (after some scanning/trimming to remove adapters)? Does that indicate a problem? You can also give FilterByTile a try to see if that is able to improve the situation.
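If you want to check for yourself whether the same tiles are low across runs, independent of the FastQC plot, you can tabulate the mean base quality per tile directly from the read headers. Below is a minimal sketch, not what FastQC or FilterByTile do internally. It assumes Illumina CASAVA 1.8+ style headers (`@instrument:run:flowcell:lane:tile:x:y ...`), Phred+33 qualities and 4-line FASTQ records. The function names and the `margin` cutoff are made up for illustration.

```python
import gzip
from collections import defaultdict
from statistics import median


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def per_tile_mean_quality(handle, phred_offset=33):
    """Return {(lane, tile): mean base quality} for a 4-line-per-record FASTQ.

    Assumes headers like '@instrument:run:flowcell:lane:tile:x:y 1:N:0:INDEX'.
    """
    totals = defaultdict(lambda: [0, 0])  # (lane, tile) -> [quality sum, base count]
    while True:
        header = handle.readline()
        if not header:
            break
        handle.readline()  # sequence line (not needed here)
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        fields = header.split(" ")[0].split(":")
        key = (fields[3], fields[4])  # lane and tile number
        totals[key][0] += sum(ord(c) - phred_offset for c in qual)
        totals[key][1] += len(qual)
    return {k: s / n for k, (s, n) in totals.items() if n > 0}


def flag_low_tiles(means, margin=2.0):
    """Return tiles whose mean quality is more than `margin` below the median tile.

    `margin` is an arbitrary illustrative cutoff, not a recommended value.
    """
    if not means:
        return []
    mid = median(means.values())
    return sorted(k for k, v in means.items() if v < mid - margin)


# Example use (the file name is a placeholder):
# with open_fastq("run1_R1.fastq.gz") as fh:
#     print(flag_low_tiles(per_tile_mean_quality(fh)))
```

Running this on each run and comparing the flagged (lane, tile) pairs would show whether the low-quality positions really repeat between runs.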
I do think there was a sequencing problem... I will try to clean it using the tools suggested. Thanks.
If you feel that the data is not satisfactory, you should be able to talk with your sequence provider and express your concerns. Failures (hardware/software) during sequencing are compensated by Illumina at no cost to the provider (as long as they have a maintenance contract, which they will if they sequence commercially).