Understanding Mpileup Alternate Allele Coding
12.1 years ago
Joseph Hughes ★ 3.0k

Hi,

I am trying to count all the bases found at a specific site within a bam file using the following command:

samtools mpileup -Ef reference.fa -r chr2:200-200 aln.bam

This produces the following output:

chr2   200    C    2130    .$.$.$.$..,,,,$............,.........T.....................,.,,,,,,.,.........................................................................A...............................,,.*...*.................,,,,,*............A*.........................................................................*..................................................................................................................................................,*,,,.....,......,.....,..................................................................,.,..*,.,,..*.,*,,,,................,........,..................,.......*...................,,,,,.,,,,,*..................................................,.,,,,,,.,,*,,,.,,,,...............,.......................................................T...............................................................,...........,.,.........,..,,*,,,.,,,,,..................................,.A...................................................................................................................................................,..................................,...............................................,...,.............,..,...,,,......,............,....,.......................................................................,,,,.,,,,,,,...........,..................,*..,,,,,,.,.............................................................,................,...........................................A.....................................................................,..,,..*,,,,,................,,............,..,...................,.........,,,,,.,,.,.........................................................................................................................................,............................,................,............,.,.,,,......,,,,,,,,,,,,,,,,,,........................,,..,............,.,,......,,,.,,,,,,,,,.........,.,,,,,,,,,,,,.,,,,,,.,,............................,,,,,,,,,.,,,,,,,,,,,
,,...........T.........,......,......^~.^~.^R.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^>,^~.^\.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~,^~.^~.^~.^~.^~.^~,^~.^~.^~.^~.^_.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.^[.^~.^~.^~.^~.^~.^~.^~.^b.^~.^~.^~.^=,^~.^~.^~.^~.^~.^~.^~.^~.^~.^~,^~.^~.^~.^~.^P,^~.^~.^~.^~.^X.^~.^~.^>.^~.^~.^~.^~.^~.^~.^~.^~.^~.^~.

I realize that A and T correspond to alternate alleles, and that . and , correspond to the reference allele on the forward and reverse strand respectively, but what do all the other symbols correspond to?

Cheers, Joseph

samtools mpileup snp • 3.1k views
12.1 years ago
matted 7.8k

Read the documentation; it's all answered there (the paragraph beginning "In the pileup format").

There are other Biostar questions on the same topic as well, like Some Help Understanding With Mpileup Output.

12.1 years ago

See the code of samtools/bam_plcmd.c, in the function pileup_seq.

For example, at a gap in the read: if the position is a "reference skip", the symbol will be '>' on the '+' strand and '<' on the '-' strand; otherwise (a deletion) it will be '*':

putchar(p->is_refskip? (bam1_strand(p->b)? '<' : '>') : '*');
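The same decision can be written out in Python for readers who don't read C. This is only an illustrative sketch: `gap_symbol` and its two boolean parameters are hypothetical names, and it assumes `bam1_strand` is true for reverse-strand reads.

```python
def gap_symbol(is_refskip: bool, is_reverse: bool) -> str:
    """Return the pileup character printed where a read has a gap.

    Hypothetical helper mirroring the C ternary above; it assumes
    bam1_strand() is true for reads on the reverse strand.
    """
    if is_refskip:
        # Reference skip (CIGAR 'N'): direction of the arrow gives the strand.
        return "<" if is_reverse else ">"
    # Otherwise the gap is a deletion.
    return "*"


if __name__ == "__main__":
    print(gap_symbol(True, False))
    print(gap_symbol(True, True))
    print(gap_symbol(False, False))
```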
12.1 years ago
Joseph Hughes ★ 3.0k

A deleted base is represented by *. $ marks the end of a read, and ^ marks the start of a read. The ASCII code of the character following ^, minus 33, gives the mapping quality: ~ is ASCII 126, i.e. a mapping quality of 93, and \ is ASCII 92, i.e. a mapping quality of 59.
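To actually count the bases at the site, the read-bases column can be walked character by character using the rules above. The following is a minimal Python sketch, not samtools code: `count_pileup_bases` and `mapq_from_char` are hypothetical helper names, and it assumes the column follows the standard pileup encoding (including the `+N`/`-N` indel notation described in the documentation).

```python
from collections import Counter


def mapq_from_char(ch: str) -> int:
    """Mapping quality encoded by the character that follows '^'."""
    return ord(ch) - 33


def count_pileup_bases(bases: str, ref: str) -> Counter:
    """Count base calls in the read-bases column of one pileup line.

    Hypothetical helper; `ref` is the reference base from column 3.
    """
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read start: the next character is the mapping quality, not a base.
            i += 2
        elif c == "$":
            # Read end: carries no base call.
            i += 1
        elif c in "+-":
            # Indel: sign, a decimal length N, then N inserted/deleted bases.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            i = j + int(bases[i + 1:j])
        elif c in ".,":
            # Match to the reference on the forward (.) or reverse (,) strand.
            counts[ref.upper()] += 1
            i += 1
        elif c in "<>":
            # Reference skip: no base call at this position.
            i += 1
        else:
            # Mismatch (upper case = forward, lower case = reverse) or '*'.
            counts[c.upper()] += 1
            i += 1
    return counts


if __name__ == "__main__":
    example = ".$,^~.T*a^\\.-2AG,"
    print(count_pileup_bases(example, "C"))
    print(mapq_from_char("~"), mapq_from_char("\\"))
```

Skipping the character after ^ before anything else matters, because a mapping-quality character can itself look like a base or symbol (for example `>` or `=` in the output above).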
