How to get sequencing depths from VCF with Rsamtools
1
0
Entering edit mode
7.4 years ago
John Ma ▴ 310

I got from our collaborator some pindel output and the BAM files that is used as pindel input. Unfortunately, they used an older of version of pindel (before approximately 0.2.4u) that doesn't record total depth at breakpoint (i.e. the DP field in VCF). While our collaborator's standard practice is to obtain depths through Rsamtools, I haven't found a good way to do this. Specifically, I know how to read sequencing depths to an Rle objects, and on the other end convert pindel outputs to VCF and then to GenomicRanges, but how do I integrate the two?

R pindel Rsamtools • 3.0k views
ADD COMMENT
0
Entering edit mode

"By design, the call annotations in the VCF reflect the data that the caller/genotyper actually used in their calculation. " how could you known the number of reads used by pindel ?

ADD REPLY
0
Entering edit mode

The docs for the current versions of pindel states depth is the same I'd get from samtools depth.

ADD REPLY
0
Entering edit mode
7.4 years ago

I've written a tools to update the DP fields from a set of BAM files: http://lindenb.github.io/jvarkit/FixVcfMissingGenotypes.html

$ find ~/src/gatk-ui/testdata/ -name "*.bam" > input.list

$ tail -2 input.vcf
rotavirus   1064    .   G   A   21.5606 .   DP=250;VDB=2.70971e-16;SGB=8.40135;RPB=0.935144;MQB=1;BQB=0.683886;MQ0F=0;AF1=0.25;G3=0.75,2.37734e-17,0.25;HWE=0.033921;AC1=2;DP4=0,219,0,31;MQ=60;FQ=22.8019;PV4=1,1.22605e-06,1,1    GT:PL   0/0:0,244,70    0/0:0,199,65    0/0:0,217,68    1/1:69,84,0
rotavirus   1064    .   G   A   21.5606 .   DP=250;VDB=2.70971e-16;SGB=8.40135;RPB=0.935144;MQB=1;BQB=0.683886;MQ0F=0;AF1=0.25;G3=0.75,2.37734e-17,0.25;HWE=0.033921;AC1=2;DP4=0,219,0,31;MQ=60;FQ=22.8019;PV4=1,1.22605e-06,1,1    GT:PL   ./. ./. ./. ./.

$ java -jar dist/fixvcfmissinggenotypes.jar -d 50 --fixDP --filtered zz -B input.list input.vcf | tail -2
rotavirus   1064    .   G   A   21.56   .   AC1=2;AF1=0.25;BQB=0.683886;DP=188;DP4=0,219,0,31;FQ=22.8019;G3=0.75,2.37734e-17,0.25;HWE=0.033921;MQ=60;MQ0F=0;MQB=1;PV4=1,1.22605e-06,1,1;RPB=0.935144;SGB=8.40135;VDB=2.70971e-16    GT:DP:PL    0/0:48:0,244,70 0/0:63:0,199,65 0/0:53:0,217,68 1/1:24:69,84,0
rotavirus   1064    .   G   A   21.56   .   AC1=2;AF1=0.25;BQB=0.683886;DP=72;DP4=0,219,0,31;FQ=22.8019;G3=0.75,2.37734e-17,0.25;HWE=0.033921;MQ=60;MQ0F=0;MQB=1;PV4=1,1.22605e-06,1,1;RPB=0.935144;SGB=8.40135;VDB=2.70971e-16 GT:DP:FT:FXG    ./.:48:PASS 0/0:63:zz:1 0/0:53:zz:1 ./.:24:PASS
ADD COMMENT

Login before adding your answer.

Traffic: 2194 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6