comparing variants between two VCF files
3.3 years ago
BDK_compbio ▴ 140

I have two VCF files (e.g. SV1.vcf.gz, SV2.vcf.gz) and a BED file (reg.bed). I would like to compare the variants between the two VCFs within the BED regions. The comparison should report the variants common to both files and the variants unique to SV1 and to SV2.

I am currently doing this in several steps, as follows:

tabix -R reg.bed SV1.vcf.gz > SV1.BR.vcf.gz  
tabix -R reg.bed SV2.vcf.gz > SV2.BR.vcf.gz

It seems the above two steps do not retain the headers, so I had to extract the headers, prepend them to each output, and then run bgzip and tabix. To find the common and unique variants between the two files, I used the following:

bcftools isec -n~11 -c all SV1.BR.vcf.gz SV2.BR.vcf.gz > common.txt
# the ^ prefix excludes the listed (common) sites, leaving the unique ones
bcftools view -T ^common.txt SV1.BR.vcf.gz -Oz > SV1.unique.vcf.gz
bcftools view -T ^common.txt SV2.BR.vcf.gz -Oz > SV2.unique.vcf.gz

I am wondering whether there is an existing tool that does this, so that I do not have to script the commands above. A tool that also produces good visualizations would be really useful.

VCF SV BED • 2.2k views

tabix -h will include the header
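
To make that concrete, here is a minimal sketch rather than a tested pipeline. The file names are the ones from the question; the function names `subset_vcf` and `compare_vcfs` and the output directory `isec_out` are hypothetical, made up for illustration. It assumes the inputs are bgzipped and tabix-indexed, and that tabix, bgzip and bcftools are on the PATH.

```shell
# Sketch only: subset_vcf and compare_vcfs are hypothetical helper names.

# Subset an indexed VCF to the BED regions, keeping the header (-h),
# then recompress with bgzip and index the result.
# $1 = input .vcf.gz, $2 = BED file, $3 = output .vcf.gz
subset_vcf() {
    tabix -h -R "$2" "$1" | bgzip > "$3" && tabix -p vcf "$3"
}

# Split two indexed VCFs into private and shared records in one call.
# With -p, bcftools isec writes into the directory $3:
#   0000.vcf.gz  records private to the first file
#   0001.vcf.gz  records private to the second file
#   0002.vcf.gz  records from the first file shared by both
#   0003.vcf.gz  records from the second file shared by both
# $1 = first .vcf.gz, $2 = second .vcf.gz, $3 = output directory
compare_vcfs() {
    bcftools isec -c all -Oz -p "$3" "$1" "$2"
}

# Example usage (commented out so that sourcing this only defines the functions):
# subset_vcf SV1.vcf.gz reg.bed SV1.BR.vcf.gz
# subset_vcf SV2.vcf.gz reg.bed SV2.BR.vcf.gz
# compare_vcfs SV1.BR.vcf.gz SV2.BR.vcf.gz isec_out
```

Note that `-c all` matches records by position only, regardless of alleles, as in the original `isec` command. That may or may not be what you want for SVs, since two callers rarely report identical breakpoints.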

3.3 years ago

You may want to look at https://github.com/asntech/intervene. We use this to produce VCF UpSet plots for benchmarking.

intervene upset --figtype png --type genomic -i vcf1 vcf2 vcf3 --save-overlaps --filenames --bedtools-options header
