How to set variant FILTER in a VCF file based on overlap with regions in a BED file
2
0
Entering edit mode
3.6 years ago
William ★ 5.3k

How to set variant FILTER in a VCF file based on overlap with regions in a BED file?

I guess this should be possible using e.g. bcftools

But the bcftools filter command can't do this annotation direct.

http://samtools.github.io/bcftools/bcftools.html#filter

Seems like I first need to annotate the overlap in an INFO field.

http://samtools.github.io/bcftools/bcftools.html#annotate

Is there a way to set the FILTER status of a variant in 1 single command (without adding an INFO field) based on overlap of the variant with a region in a BED file?

Using bcftools or any other tool?

annotation bed vcf • 3.9k views
ADD COMMENT
2
Entering edit mode
3.6 years ago

I wrote http://lindenb.github.io/jvarkit/VCFBedSetFilter.html

$java -jar dist/vcfbedsetfilter.jar --filter MYFILTER --blacklist in.bed in.vcf

ADD COMMENT
1
Entering edit mode
3.6 years ago
William ★ 5.3k

I figured out how to do the annotation using BCFTools. 2 steps are needed.

Input BED file requires 1 for each region where the annotation should be set

Chr_01 1000 2000 1
Chr_05 5000 6000 1

Input header file:

##INFO=<ID=BAD_REGION,Number=0,Type=Flag,Description="My bad region for some reason">

bgzip and tabix the bed file

bgzip bad_regions.bed
tabix -p bed regions.bed.gz

First bcftools command to set the INFO field

bcftools annotate -a bad_regions.bed.gz  -h bad_regions.bed.hdr -c CHROM,FROM,TO,BAD_REGION  input.vcf.gz  -Oz -o output_w_INFO.vcf.gz

Second bcftools command to set the FILTER based on the INFO field

bcftools filter output_w_INFO.vcf.gz  -s BAD_REGION -m+ -e 'INFO/BAD_REGION=1' -Oz -o output_w_INFO_w_FILTER.vcf.gz
ADD COMMENT
0
Entering edit mode

I tested and I think the last filtering should be:

bcftools filter output_w_INFO.vcf.gz  -s BAD_REGION -m+ -e 'INFO/BAD_REGION=1' | bcftools view -f PASS  -Oz -o output_w_INFO_w_FILTER.vcf.gz

In order to remove the BAD_REGION annotated SNPs.

ADD REPLY
0
Entering edit mode

I wanted to keep the BAD_REGION SNPs, but just tabel these SNPs.

If you just want to remove these SNPs, than it's much more straightforward to do something like bedtools subtract -a input.vcf -b bad_region.bed

ADD REPLY

Login before adding your answer.

Traffic: 1829 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6