It will do that by default if you don't output in BCF format (that is, omit -g and -u; -b is the option for passing a list of input BAM files). It will give you:
chr10 250 N 85 A$A$A$a$AaaAAaaaaTaaaaaAaAaaaAAAaaaAaAaaaaaaaaAAaaaaaAAaAaaaaaaaAaaaaaAAaAaaAaaaaaaAaAaa^!a hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
chr10 251 N 85 A$a$a$AAaaaaAaaaaaAaAaaaAAAaaaAaAaaaaaaaaAAaaaaaAAaAaaaaaaaAaaaaaAAaAaaAaaaaaaAaAgaa^!A^!a^!A^!a hhhhhhhhhdhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
chr10 252 N 85 G$G$g$g$ggGgggggGgGgggGGGgggGgGggggggggGGgggggGGgGgggggggGgggggGGgGggGggggggGgGgggGgGg^!g^!g^!g hhhhhhhhhhhhhhhhhhhhKhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhghhhhhhhhhhhhhhXhhhhhhhhhhhJhXh
chr10 253 N 83 c$c$CcccccCcCcccCCCcccCcCccccccccCCcccccCCcCcccccccCcccccCCcCccCccccccCcCcccCcCcccc^!C^!c hhghhhhhhhhhhhhhHhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
chr10 254 N 84 T$t$t$tttTtTtttTTTtttTtTttttttttTTtttttTTtTtttttttTtttttTTtTttTttttttTtTtttTtTttttTt^!T^!T^!t hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
chr10 255 N 83 a$a$aAaAaaaAAAaaaAaAaaaaaaaaAAaaaaaAAaAaaaaaaaAaaaaaAAaAaaAaaaaaaAaAaaaAaAaaaaAaAAa^!a^!a hhhhhhhhhhh>hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
And if you specify the reference (with -f), it will replace matching bases with "." (match on the forward strand) or "," (match on the reverse strand):
chr10 250 A 84 .$.$.$,$.,,..,,,,T,,,,,.,.,,,...,,,.,.,,,,,,,,..,,,,,..,.,,,,,,,.,,,,,..,.,,.,,,,,,.,.,^!, ::::AAAYYYY\\EeehhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhfhhhhhhMhD\\\[[BB;;=
chr10 251 A 84 .$,$,$..,,,,.,,,,,.,.,,,...,,,.,.,,,,,,,,..,,,,,..,.,,,,,,,.,,,,,..,.,,.,,,,,,.,.,,^!.^!,^!.^!, ???TTTT\\EeehhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhfhhhhhhPhG\\\]]EE>>>EEEE
chr10 252 G 84 .$.$,$,$,,.,,,,,.,.,,,...,,,.,.,,,,,,,,..,,,,,..,.,,,,,,,.,,,,,..,.,,.,,,,,,.,.,,.,.,^!,^!,^!, EEEEUUEeehhhhhhhhhhhKhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhghhhhhhhhhhhhhhXhhhhhheUUUJEEE
chr10 253 C 83 ,$,$.,,,,,.,.,,,...,,,.,.,,,,,,,,..,,,,,..,.,,,,,,,.,,,,,..,.,,.,,,,,,.,.,,,.,.,,,,^!.^!, EEEUUeehhhhhhhhhHhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhdhheeeeUUUEE
chr10 254 T 84 .$,$,$,,,.,.,,,...,,,.,.,,,,,,,,..,,,,,..,.,,,,,,,.,,,,,..,.,,.,,,,,,.,.,,,.,.,,,,.,^!.^!.^!, ADDIIUUUUUUUhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhheeeUUEEE
chr10 255 A 83 ,$,$,.,.,,,...,,,.,.,,,,,,,,..,,,,,..,.,,,,,,,.,,,,,..,.,,.,,,,,,.,.,,,.,.,,,,.,..,^!,^!, CCRRRRRRRhh>hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhheeQQQEE
This assumes you want to do it by parsing the pileup, like you said. I believe you can get the same information out of BCF/VCF directly.
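If you do go the parsing route, here is a minimal sketch in Python of counting the bases at each position from the fifth column. The function names count_bases and parse_pileup_line are my own, not part of samtools, and the sketch assumes the usual pileup conventions: "^" is followed by one mapping-quality character, "$" marks a read end, "+N"/"-N" is followed by N indel bases, and "*", "<" and ">" are not base calls. It ignores strand and the quality column.

```python
from collections import Counter


def count_bases(ref_base, read_bases):
    """Count base calls in one pileup read-bases string, ignoring strand."""
    counts = Counter()
    i = 0
    n = len(read_bases)
    while i < n:
        c = read_bases[i]
        if c == "^":
            # Read start: the next character is a mapping quality, not a base.
            i += 2
        elif c == "$":
            # Read end marker, carries no base.
            i += 1
        elif c in "+-":
            # Indel: sign, length digits, then that many indel bases to skip.
            j = i + 1
            while j < n and read_bases[j].isdigit():
                j += 1
            i = j + int(read_bases[i + 1:j])
        elif c in ".,":
            # Match to the reference (only appears when -f was given).
            counts[ref_base.upper()] += 1
            i += 1
        elif c.upper() in "ACGTN":
            # Explicit base call; lowercase means reverse strand.
            counts[c.upper()] += 1
            i += 1
        else:
            # '*', '<', '>' and anything unrecognised: skip.
            i += 1
    return counts


def parse_pileup_line(line):
    """Return (chrom, pos, Counter of bases) for one pileup line."""
    chrom, pos, ref, _depth, bases = line.split()[:5]
    return chrom, int(pos), count_bases(ref, bases)


if __name__ == "__main__":
    # Made-up short line in the same layout as the output above.
    print(parse_pileup_line("chr10 250 A 4 .$,T^!, hhhh"))
```

You would feed it the lines of "samtools mpileup -f ref.fa tmp.bam" (ref.fa being whatever your reference is called). It also works without -f, since then every base is spelled out and the "." and "," branch is simply never hit.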
Perfect, thanks. This is exactly what I needed. I just did the command "samtools mpileup tmp.bam" to get the kind of output you displayed.