We have mapped whole-genome sequence data in BAM format. We are testing a variety of variant callers to identify mutant sites. These variant callers apply different filters that result in sites being excluded from variant calling. For example, some variant callers require >=3 reads to call a variant. Another excludes reads with >3 mismatches to the reference genome.
We would like to know the effective coverage that a variant caller sees, given that many sites and reads are excluded by such filters. This would enable us to calculate our sensitivity to call variants across the genome and to identify regions where no variants can be called.
We think the best way to do this would be a script or tool that reads a BAM file and outputs a bigWig file with the effective coverage across the genome after applying these filters. This could then be adapted with new filters depending on the variant caller. Is there an existing tool that can take a BAM file and output such information based on two filters: minimum read depth (> X) and maximum number of mismatches in a read (< Y)?
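To make the request concrete, here is a rough sketch of the computation we have in mind. It is only an illustration under several assumptions: the function names `effective_depth` and `to_bedgraph` are made up, reads are represented as plain `(start, end, n_mismatches)` tuples with 0-based half-open coordinates, and the mismatch count is assumed to have been taken from something like the `NM` tag of each BAM record (which counts edit distance, so it also includes indels). Sites that fail the minimum depth are reported as 0.

```python
from itertools import groupby


def effective_depth(reads, length, max_mismatches, min_depth):
    """Per-base depth on one contig after applying both filters.

    reads: iterable of (start, end, n_mismatches), 0-based half-open
           (hypothetical representation; in practice parsed from a BAM).
    Reads with more than max_mismatches are ignored; positions whose
    remaining depth is below min_depth are set to 0.
    """
    # Difference array: +1 where a kept read starts, -1 where it ends.
    diff = [0] * (length + 1)
    for start, end, n_mismatches in reads:
        if n_mismatches > max_mismatches:
            continue  # read-level filter
        start, end = max(start, 0), min(end, length)
        if start >= end:
            continue
        diff[start] += 1
        diff[end] -= 1

    depth = []
    running = 0
    for delta in diff[:length]:
        running += delta
        # Site-level filter: too few reads means no call is possible.
        depth.append(running if running >= min_depth else 0)
    return depth


def to_bedgraph(chrom, depth):
    """Collapse runs of equal depth into (chrom, start, end, value) rows."""
    rows = []
    pos = 0
    for value, run in groupby(depth):
        n = sum(1 for _ in run)
        rows.append((chrom, pos, pos + n, value))
        pos += n
    return rows
```

The rows are in bedGraph layout, so they could presumably be written to a file and converted to bigWig with a tool such as UCSC `bedGraphToBigWig`. Other caller-specific filters would slot in next to the mismatch check.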
Thanks in advance for any help and suggestions.
Thanks very much Pierre, I'll give this a go! For clarification, will this only result in a list of regions, or will it also give the effective coverage at each site?