I have been trying to understand how to calculate the probability of a correct base call in my Illumina DNA sequencing results coming off both the NextSeq and the HiSeq 4000. My understanding is that these can use different formats, such as Illumina 1.7 or 1.8, and that the quality characters translate to Phred scores differently depending on the format.
So, is there a simple ASCII conversion that turns the Illumina quality characters into Phred scores, which can then be used with a formula like Q = -10 log10(P), where P is the probability that the base call is wrong, to compute the probability (1 - P) that the base call is correct?
Just so you know, you still have to know which version was used, otherwise you could end up applying the +64 offset to everything instead of +33. Perhaps the NextSeq and HiSeq 4000 are such new machines that they never came with anything older than 1.8?
They are - Illumina's software for those platforms is ASCII-33 exclusively.
Ah awesome :)
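For anyone landing here later, the conversion discussed in this thread can be sketched in a few lines of Python. This is only a minimal illustration, not anything from Illumina's software: the function names `phred_scores` and `error_probability` are made up for the example, and the default offset of 33 assumes Illumina 1.8+ output (pass `offset=64` for the older Illumina 1.3 to 1.7 encoding).

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    offset=33 assumes Illumina 1.8+ (Sanger-style) encoding;
    use offset=64 for Illumina 1.3 to 1.7.
    """
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Invert Q = -10 * log10(P): P is the chance the call is wrong."""
    return 10 ** (-q / 10)


# Hypothetical quality string, just for illustration.
scores = phred_scores("II5+#")  # [40, 40, 20, 10, 2]
for q in scores:
    p_wrong = error_probability(q)
    print(q, p_wrong, 1 - p_wrong)  # score, P(wrong), P(correct)
```

So a Q of 20 means a 1 in 100 chance that the call is wrong, i.e. a 99% chance that it is correct, and Q30 means 99.9%.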