Hey everyone, and before anything else, thanks for the help!
I started working on a new project with low-coverage DNA-seq data. The project uses .bam files and pileups generated with samtools mpileup. After a lot of reading, I found out that the data uses Sanger FASTQ mapping quality. Note: I work on reads with no ref bases, if that changes anything.
I am trying to understand whether there will always be a mapping quality after the sign that marks a new read. That is to say: will it always be in the format '^XB', where X stands for the read quality and B stands for the base in the read?
Thanks again for the help, and if there is an older post about this, I would be happy if you could give me a link.
For example, my pileups look like this:
Chr_name pos ref 3 GG^]G etc...
Hello saar.armon and welcome to biostars,
could you please explain how the mpileup was created? There should be 6 columns and not just 5 (see here and here).
The qualities given in an mpileup are not mapping qualities (which say how well the entire read maps over its whole length). Instead, they are base qualities (the estimated probability that the base called at this position in the given read is wrong).
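To make the base quality idea concrete, here is a minimal sketch of how a single quality character can be turned into an error probability. It assumes the Sanger (Phred+33) encoding mentioned in the question; the function name is only illustrative and is not part of samtools.

```python
def phred33_to_error_prob(ch):
    """Convert one Phred+33 quality character to an error probability.

    Assumes Sanger encoding: quality Q = ASCII code - 33,
    and error probability P = 10 ** (-Q / 10).
    """
    q = ord(ch) - 33
    return 10 ** (-q / 10)


if __name__ == "__main__":
    for ch in "!+5I":
        print(ch, ord(ch) - 33, phred33_to_error_prob(ch))
```

The same offset of 33 applies to the character that follows ^ in the read bases column, except that there the value is a mapping quality for the whole read rather than a quality for one base.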
fin swimmer
The bit after the ^ in the sequence column is the mapping quality. You're referring to the next column, which contains base qualities. Note that the base qualities may have been modified by the BAQ algorithm (see the mpileup -B option).
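As a rough sketch of how the ^ marker can be read out of the sequence column, the snippet below walks the string from left to right and, for every ^, decodes the next character as a mapping quality (ASCII code minus 33). It assumes the usual pileup conventions (^ plus one quality character at a read start, $ at a read end, +N/-N followed by N bases for indels); the function name is made up for illustration and this is not a full pileup parser.

```python
def read_start_mapqs(bases):
    """Return (mapq, next_symbol) for every read start in a pileup bases string.

    Scans sequentially so that a MAPQ character that happens to be
    '^', '$', '+' or '-' is not mistaken for a marker.
    """
    starts = []
    i, n = 0, len(bases)
    while i < n:
        c = bases[i]
        if c == "^" and i + 1 < n:
            # The character right after '^' encodes MAPQ as ASCII - 33.
            mapq = ord(bases[i + 1]) - 33
            nxt = bases[i + 2] if i + 2 < n else None
            starts.append((mapq, nxt))
            i += 2  # skip '^' and the MAPQ character
        elif c in "+-":
            # Indel: sign, decimal length, then that many inserted/deleted bases.
            j = i + 1
            while j < n and bases[j].isdigit():
                j += 1
            if j == i + 1:
                i += 1  # no length given; treat as an ordinary symbol
            else:
                i = j + int(bases[i + 1:j])
        else:
            i += 1
    return starts


if __name__ == "__main__":
    # The example from the question: ']' is ASCII 93, so MAPQ = 60.
    print(read_start_mapqs("GG^]G"))
```

On the example line from the question, the ] after ^ decodes to a mapping quality of 60, and the G that follows is the first base of that read.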