I have a FASTQ* file with reads from an Illumina machine and am trying to do quality-control filtering with the FASTX-Toolkit, but I am running into problems with the quality scores (see this post for a nice discussion of the scores).
While fastx_quality_stats and fastx_trimmer run without complaining, fastq_quality_filter is suddenly not happy with the files:
fastq_quality_filter: Error: invalid quality score data on line 148 (quality_tok = "Z]aaaaa]O]aabaaaaa]" Petra*read2.fasta
The particular read looks like this:
@Petra_4_1_1_10_1327/1
AGTATTTTTGAATCTCATCATCGTCACTTCACTAAG
+Petra_4_1_1_10_1327/1
`Z]aaaaa]O]aabaaaaa`][FW`__a`\FW_X[M
Does anyone have a suggestion (other than deleting this read) or some experience?
*Well, it is labeled .FASTA, but it looks like a FASTQ file.
This may be a stupid question, but are you sure about which encoding is being used for quality scores? I'd assume the latest Illumina encoding, but can you be sure?
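One quick way to check is to look at the ASCII range of the quality characters. Here is a minimal sketch in Python, assuming the usual conventions: offset 33 for Sanger and Illumina 1.8+, offset 64 for Illumina 1.3 to 1.7 and Solexa. The function name `guess_phred_offset` and the cutoffs 59 and 74 are my own illustration, not part of the FASTX-Toolkit.

```python
def guess_phred_offset(quality: str) -> str:
    """Guess the quality encoding from the ASCII codes in a quality string.

    Assumed cutoffs (illustrative only):
      - any character below ASCII 59 can only occur with offset 33
      - any character above ASCII 74 can only occur with offset 64
    """
    codes = [ord(c) for c in quality]
    lo, hi = min(codes), max(codes)
    if lo < 59:
        return "phred33"
    if hi > 74:
        return "phred64"
    # All characters fall in the overlap of both encodings.
    return "ambiguous"


# Quality line of the read from the question (raw string because of the backslash).
qual = r"`Z]aaaaa]O]aabaaaaa`][FW`__a`\FW_X[M"

codes = [ord(c) for c in qual]
print(min(codes), max(codes))    # lowest and highest ASCII code in the read
print(guess_phred_offset(qual))
```

For the read posted above, the characters run from ASCII 70 ('F') to 98 ('b'), which fits an offset-64 encoding and would be well outside the usual range under offset 33. A single read is weak evidence, though, so it is worth running the same check over every quality line in the file before drawing a conclusion.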