Hi everyone, I would like to trim a fixed number of bases from the beginning of every read in a FASTQ file before mapping to the genome with bowtie2. I used Cutadapt:
cutadapt -u 48 -o output.fastq.gz input.fastq.gz
After trimming, my FASTQ file looks like this:
gunzip -c output.fastq.gz | head
@NB502143:99:HFF7TAFX2:1:11101:4133:1019 1:N:0:ATCACG
CATGAAAAAGAGCTCATTTTCAGATGCAGGAATTCCTATCCG
+
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
@NB502143:99:HFF7TAFX2:1:11101:19790:1020 1:N:0:ATCACG
CATGATCCACTTTTCCACGCGCTTTGACGACCATTTTATAA
+
EEEEE<EEEEEEEEEEEEEEEEE<EE/EEAEEEEEEEEEEE
@NB502143:99:HFF7TAFX2:1:11101:6327:1020 1:N:0:ATCACG
CATGATCTCAGTAAAGGCATTTGTGGTTGTTAAGTAGCCATT
When I try to map it with bowtie, I get the following error message:
Saw ASCII character 10 but expected 33-based Phred qual.
I don't get this error if I map input.fastq.gz, so I suspect something is going wrong during trimming, but I can't figure out what. Thanks for your help.
I checked with FastQC; both files are reported as Sanger / Illumina 1.9 encoded.
Tried to run:
it didn't change anything.
However, FastQC shows low read quality towards the end of the reads after trimming, which is not the case before trimming. So it seems the quality scores are being messed up, but I don't understand how.
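ASCII character 10 is the newline, so the error suggests bowtie reached the end of a line where it expected a quality character. One way to narrow this down might be to scan the trimmed file for records whose sequence and quality lines differ in length, or whose sequence is empty. Below is a minimal sketch of such a check; it assumes plain four-line FASTQ records (no wrapped sequences), and the names `open_fastq` and `check_fastq` are made up for illustration.

```python
import gzip
import sys
from collections import Counter


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def check_fastq(lines):
    """Count suspicious records in an iterable of FASTQ lines.

    Assumes each record is exactly four lines:
    header, sequence, '+' separator, quality.
    """
    problems = Counter()
    record = []
    for line in lines:
        record.append(line.rstrip("\r\n"))
        if len(record) == 4:
            header, seq, plus, qual = record
            problems["records"] += 1
            # Header must start with '@' and the separator with '+'.
            if not header.startswith("@") or not plus.startswith("+"):
                problems["bad_layout"] += 1
            # Reads trimmed down to nothing leave an empty sequence line.
            if len(seq) == 0:
                problems["empty_reads"] += 1
            # Sequence and quality strings must have the same length.
            if len(seq) != len(qual):
                problems["length_mismatch"] += 1
            record = []
    # Leftover lines mean the file does not end on a record boundary.
    if record:
        problems["truncated_final_record"] += 1
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    with open_fastq(sys.argv[1]) as handle:
        print(dict(check_fastq(handle)))
```

Saved under a hypothetical name such as check_fastq.py, it could be run as `python check_fastq.py output.fastq.gz`. A nonzero `empty_reads` count would mean some reads were shorter than the 48 bases removed and were trimmed to nothing, which might be what bowtie is choking on; a nonzero `length_mismatch` would instead point at the quality strings themselves.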