This may be a ridiculously simple question, but I have a compressed genomic VCF (gVCF) file generated by the Strelka germline variant caller, with lines like the following where no variation was detected:
chr1 27394730 . T . . PASS END=27394756;BLOCKAVG_min30p3a GT:GQX:DP:DPF:MIN_DP 0/0:3070:1137:14:1122
I need to intersect this with a set of regions I'm interested in. I have tried using bedtools intersect with a suitable BED file, but this only matches the start of this blocked region at chr1 27394730 and not the remainder of the interval chr1:27394730-27394756.
Is there a way to run this intersection using bedtools?
I would think there's a way to do this by converting the gVCF into a BED file (preserving the variants that Strelka has found), but if there's a tool that can do this directly, please point me in that direction.
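For what it's worth, here is a minimal sketch of the gVCF-to-BED conversion idea mentioned above. It is not a Strelka or bedtools feature; the function names gvcf_line_to_bed and gvcf_to_bed are made up for illustration. It assumes that a block record carries END= in its INFO column (as in the example line) and that any record without END spans the length of its REF allele.

```python
import re

# Matches END=<int> as a whole INFO key (not e.g. CIEND=...)
END_RE = re.compile(r"(?:^|;)END=(\d+)")


def gvcf_line_to_bed(line):
    """Return (chrom, start, end) in 0-based, half-open BED coordinates."""
    line = line.rstrip("\n")
    # Real VCF is tab-delimited; fall back to whitespace for pasted examples
    fields = line.split("\t") if "\t" in line else line.split()
    chrom, pos, ref, info = fields[0], int(fields[1]), fields[3], fields[7]
    match = END_RE.search(info)
    # Assumption: without END, the record covers the REF allele only
    end = int(match.group(1)) if match else pos + len(ref) - 1
    return chrom, pos - 1, end


def gvcf_to_bed(lines):
    """Yield one tab-separated BED line per data record, skipping headers."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, start, end = gvcf_line_to_bed(line)
        yield f"{chrom}\t{start}\t{end}"


if __name__ == "__main__":
    import sys
    for bed_line in gvcf_to_bed(sys.stdin):
        print(bed_line)
```

If this is saved as a script, the decompressed gVCF can be piped through it and the resulting BED given to bedtools intersect with -wa and -wb. Carrying the original VCF columns along as extra BED columns would be a straightforward extension.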
Thanks. Is there a way of stating which region overlaps with the VCF using bcftools view? I'm sort of looking for the kind of output you get with bedtools using the -wa and -wb flags.
No, that's a job for bcftools annotate, with the annotation supplied as an indexed bed.gz file.