Question

Unexpected Bowtie2 behavior with --phred64 and --phred33 flags

0

Entering edit mode

5.9 years ago

Denis ▴ 310

I'm stiil not able to figure out how Bowtie2 deal with different Phred quality encoding. In my recent post: enter link description here

genomax pointed me to use --phred64 flag when using Bowtie2 to map Phred+64 encoded reads. I tried both --phred64 and --phred33 flags to map the same reads to reference. All my reads have the same quality score in each position (just short example):

@Seq=TGCAGCTTGCTGGTCATTGAGTGGAGCTGGGTCTTAGCCTTGAGATGGA
TGCAGCTTGCTGGTCATTGAGTGGAGCTGGGTCTTAGCCTTGAGATGGA
+
fffffffffffffffffffffffffffffffffffffffffffffffff
@Seq=TGCAGCTTGCTGGTCATTGAGTGGAGCTGGGTCTTAGCCTTGAGGTGGA
TGCAGCTTGCTGGTCATTGAGTGGAGCTGGGTCTTAGCCTTGAGGTGGA
+
fffffffffffffffffffffffffffffffffffffffffffffffff

Obiously the reads are Phred+64 encoded. My first command line was:

bowtie2  --very-sensitive --phred64 -x index -U ForAlign.fa.gz -S AlignFullvs_test_64.sam

In the log file i've found:

249631 reads; of these:
  249631 (100.00%) were unpaired; of these:
    6787 (2.72%) aligned 0 times
    199271 (79.83%) aligned exactly 1 time
    43573 (17.45%) aligned >1 times
97.28% overall alignment rate

Further, i used the same command but changed --phred64 to --phred33 flag:

bowtie2  --very-sensitive --phred33 -x index -U ForAlign.fa.gz -S AlignFullvs_test_33.sam

In the log file:

249631 reads; of these:
  249631 (100.00%) were unpaired; of these:
    7132 (2.86%) aligned 0 times
    200810 (80.44%) aligned exactly 1 time
    41689 (16.70%) aligned >1 times
97.14% overall alignment rate

So, as you can see, the overall statistics was almost the same, there were not any errors or at least warnings in both runs. My questions are:

How Bowtie2 can map Phred+64 encoded reads with --phred33 flag without any errors?
1. Are the both resulted SAM files correct?

software error alignment • 3.3k views

ADD COMMENT • link updated 3.2 years ago by Devon Ryan 105k • written 5.9 years ago by Denis ▴ 310

1

Entering edit mode

You can also change the Q scores to phred+33 by using the following program from BBMap suite before mapping:

reformat.sh in=your.fq.gz out=new.fq.gz qin=64 qout=33

ADD REPLY • link 5.9 years ago by GenoMax 150k

score 2 · Accepted Answer · 2019-05-26

2

Entering edit mode

5.9 years ago

Devon Ryan 105k

The phred scores are only ever used for computing alignment scores in cases where there are mismatches. The alignment score itself will be wrong, but that won't change where the actual resulting alignments are found.
Base qualities in SAM files are defined to be Phred+33, so if you tell bowtie2 the wrong thing the resulting SAM/BAM files will technically be malformed. Whether this is actually a problem is use-case dependent (this is likely only an issue for variant calling).

ADD COMMENT • link 5.9 years ago by Devon Ryan 105k

0

Entering edit mode

Hi Devon Ryan! Many thanks for your reply, it makes things clear. Let me ask just one more question please? I'm wondering how single mismatch with quality "f" (ASCII 102) will be scored in Phred+33 mode?

ADD REPLY • link 5.9 years ago by Denis ▴ 310

1

Entering edit mode

It's a mismatch at a high quality base, so it'd be the maximum penalty. This would be the case even if Phred+64 had been used.

ADD REPLY • link 5.9 years ago by Devon Ryan 105k

0

Entering edit mode

Could you please explain me, why it would be scored by bowtie2 as a high quality base in Phred+33 mode since there is no f character in Phred+33 encoding?

ADD REPLY • link 3.2 years ago by Denis ▴ 310

1

Entering edit mode

What makes you think there's no f in Phred+33. That would be a score of 69 or so. Of course no machine produces that, but bowtie2 has no way of knowing that, it's just looking up what f is in ascii (102) and subtracting 33 from it. Tools aren't going to do any sanity checking on the results, except maybe to ensure that they're not negative.

ADD REPLY • link 3.2 years ago by Devon Ryan 105k

0

Entering edit mode

Thank you so much for the detailed explanation! But the penalty for such mismatch would be even greater than the maximum penalty. Am i right? May Bowtie2 handle that?