samtools mpileup - how to extract the alternative allele counts for *each* alternative base
2
4
Entering edit mode
8.7 years ago
Michi ▴ 990

Hello there

I am running samtools mpileup and get an output for the requested positions. My problem is that if there are multiple alternative bases, it's not possible to extract the alternative base count for each.

The command line that I use:

samtools mpileup -u -Q 0 -t DP4 -t DV -t DP -f GRCh37.fa  --positions positions.bed SAMPLE_1-ready.bam SAMPLE_2-ready.bam  SAMPLE_3-ready.bam | bcftools view  > SAMPLE-variant-sites.vcf

An example output:

12  25398284    .   C   G,T,<X> 0   .   DP=612;I16=351,184,53,24,19737,1.05346e+06,2839,140417,32100,1.926e+06,4620,277200,9893,221105,1557,36041;QS=2.58612,0.40494,0.0089373,0;VDB=0.0247352;SGB=4.66664;RPB=0.97648;MQB=1;MQSB=1;BQB=0.965306;MQ0F=0 PL:DP:DV:DP4    0,255,255,255,255,255,255,255,255,255:208:3:133,72,2,1  255,0,255,255,255,255,255,255,255,255:186:74:74,38,51,23    0,255,255,255,255,255,255,255,255,255:218:0:144,74,0,0

As you may see, there are G as well as T alternative reads . And if we look at the data from SAMPLE_1, we can see that there is only one DP4 field: 133,72,2,1. So we know that three bases are either G or T in that position but we do not know how bases are G and how many are T

How would I achieve this??

samtools mpileup alt mutation • 11k views
ADD COMMENT
5
Entering edit mode
8.7 years ago

You'll need to update to the latest samtools/bcftools (1.3) and then there are new mpileup tag (-t) options -- AD, ADF, ADR which will give per-allele depth (AD) per-allele depth on the forward (ADF) and reverse (ADR) strands.

ADD COMMENT
0
Entering edit mode

Great! You saved my day. I was still using v1.2 and could not figure out how to get this info.

ADD REPLY
0
Entering edit mode

By the way - did you ever notice problems for calling the correct AD for deletions? I experience many cases where the AD is equal to DP meanwhile in truth the deletion is present in only a fraction of the reads. It puzzles me as insertions seem to be called correctly and other deletions as well.

ADD REPLY
0
Entering edit mode
4.7 years ago
Ks-seq ▴ 10

If I can ask: How get Allele fraction= reads supporting a variant (AD) / total reads (DP) ? and possibly in percentage ?

ADD COMMENT

Login before adding your answer.

Traffic: 2512 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6