Calculating effective coverage/callable genome for variant calling from a bam file
1
0
Entering edit mode
5.3 years ago
Rubal ▴ 350

We have mapped whole-genome sequence data in bam format. We are testing a variety of variant callers to identify mutant sites. The variant callers we are using have various different filters that results in sites being excluded from variant calling. For example some variant callers require >=3 reads to call a variant. Another excluded reads with >3 mismatches to the reference genome.

We would like to know what is the effective coverage that a variant caller sees for variant calling, given that many sites and reads are excluded by such filters. This would enable us to calculate our sensitivity to call variants across the genome and identify regions where no variants can be called.

We think the best way to do this would be a script or tool that reads a bam file and outputs a bigWig file with the effective coverage across the genome based on applying these filters. This could be adapted with new filters depending on the variant caller. Is there an existing tool that can take a bam file and output such information based on the two filters of minimum read depth >X and number of mismatches in a read < X?

Thanks in advance for any help and suggestions.

genome bam bigwig coverage variant-calling • 1.5k views
ADD COMMENT
1
Entering edit mode
5.3 years ago

Is there an existing tool that can take a bam file and output such information based on the two filters of minimum read depth >X and number of mismatches in a read < X?

create a first bed file with the coverage depth >X

samtools depth in.bam | awk -F '\t' '{if($3>10) printf("%s\t%d\t%s\n",$1,int($2)-1,$2);}' | bedtools merge > 1.bed

create a second bed file with NM < X

 java -jar dist/samjdk.jar --samoutputformat BAM  -e 'return !record.getReadUnmappedFlag() && (record.getIntegerAttribute("NM")==null || record.getIntegerAttribute("NM") < 10);' src/test/resources/S1.bam | bedtools  bamtobed > 2.bed

combine both bed files with 'bedtools intersect'

ADD COMMENT
0
Entering edit mode

Thanks very much Pierre, I'll give this a go! For clarificaiton will this only result in a list of regions or will it also give the effective coverage at each site?

ADD REPLY

Login before adding your answer.

Traffic: 2053 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6