2.7 years ago
Sabeen
Hello everyone,
I am doing germline mutation analysis on DNA-seq data. I have used fastqc, cutadapt, bwa, samtools and gatk for most of the analysis.
The problem I am facing is that the per-base read depth I calculate with "samtools depth" differs from the read depth (DP) reported in the VCF file.
For example, for the same sample the values look like this (A = chromosome, B = position, C = read depth from samtools depth, D = DP from the VCF):
A         B        C    D
Chr no.1  2939189  716  DP=71
Chr no.1  3140625  11   DP=1
Chr no.1  5983221  128  DP=50
I used the following command for the depth:
samtools depth -b sorted.bam > coveragefile
and this one for the VCF:
gatk HaplotypeCaller -R ref.fa -I file.bam -O output.vcf
I would really appreciate it if someone could tell me why I am getting this difference and which read depth values I should use.
Thanks
This is probably related to how samtools and GATK each handle duplicate reads, multi-mapping reads, and so on. Most likely GATK is much more restrictive than samtools: HaplotypeCaller applies its own read filters (for example on mapping quality) and can downsample reads in high-coverage regions before it reports DP. In addition, HaplotypeCaller reassembles the reads locally before calling variants, so the read alignments that samtools counts and the ones GATK ends up using may not be 100% identical.
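To see how large the gap is across all called sites rather than at a few hand-picked positions, the two outputs can be joined on chromosome and position. The following is only a minimal sketch: the function name compare_depths is made up for illustration, and it assumes the three-column tab-separated output of samtools depth (chromosome, position, depth) and a plain-text, uncompressed VCF with DP in the INFO column.

```shell
# Hypothetical helper (not part of samtools or GATK): print, for every VCF
# record, the samtools depth value and the INFO/DP value side by side.
# Usage: compare_depths coveragefile output.vcf
compare_depths() {
    depth_file=$1
    vcf_file=$2
    awk '
        BEGIN { FS = OFS = "\t" }
        # First file: samtools depth lines are CHROM, POS, DEPTH.
        NR == FNR { depth[$1 FS $2] = $3; next }
        # Second file: skip VCF header lines.
        /^#/ { next }
        {
            # Pull DP=<n> out of the semicolon-separated INFO column (field 8).
            dp = "NA"
            n = split($8, info, ";")
            for (i = 1; i <= n; i++)
                if (info[i] ~ /^DP=/) dp = substr(info[i], 4)
            key = $1 FS $2
            d = (key in depth) ? depth[key] : "NA"
            print $1, $2, d, dp
        }
    ' "$depth_file" "$vcf_file"
}
```

Running it as compare_depths coveragefile output.vcf prints one line per variant with four tab-separated columns: chromosome, position, samtools depth, and VCF DP. If the gap shrinks after recomputing the depth with stricter settings (samtools depth has a -Q option for a minimum mapping quality, for instance), that would point to read filtering rather than reassembly as the main cause.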