Tool that can merge 2 VCF files while taking "representational ambiguity" of (multi-allelic) variants into account
2.2 years ago
William ★ 5.3k

Is there a tool that can merge 2 VCF files while taking "representational ambiguity" of multi-allelic variants into account?

Ideally it would work by:

  • replaying all variant alleles from the 2 VCF files into the reference genome
  • identifying which alleles are actually the same but just written down in a different way
  • calculating what the best way is to represent the merged variants/alleles in a new (multi-allelic) variant
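To make the first two steps concrete, here is a minimal sketch of "replaying" alleles into the reference and comparing the resulting sequences. Everything in it is an assumption for illustration: the toy reference string, the helper names `apply_allele` and `same_allele`, and the choice to compare the whole sequence rather than a local window. It is not how any particular tool does it.

```python
def apply_allele(ref_seq, pos, ref, alt):
    """Replay one allele into the reference. pos is 1-based, as in VCF."""
    start = pos - 1
    if ref_seq[start:start + len(ref)] != ref:
        raise ValueError("REF allele does not match the reference sequence")
    return ref_seq[:start] + alt + ref_seq[start + len(ref):]


def same_allele(ref_seq, a, b):
    """Two alleles are equivalent if they produce the same haplotype."""
    return apply_allele(ref_seq, *a) == apply_allele(ref_seq, *b)


# Toy reference with a CA repeat (hypothetical data).
reference = "GGCACACAT"

# The same CA deletion written at two different positions.
left_form = (2, "GCA", "G")
right_form = (6, "ACA", "A")
# An unrelated SNP for contrast.
snp = (3, "C", "T")

print(same_allele(reference, left_form, right_form))  # True
print(same_allele(reference, left_form, snp))         # False
```

On a real genome you would compare only a window around the variant instead of the full chromosome, and the third step (choosing the merged representation) is not covered here.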

See also this related question and answer: "Should you decompose and normalize multi-allelic variants for comparison / ID assignment?"

The (multi-allelic) variants and their alleles differ between the two VCF files because:

  1. different technologies were used to produce the VCF files
  2. different alternative alleles are present in the samples

As far as I know, bcftools merge does not take "representational ambiguity" of variants into account.

Would first decomposing and normalizing all variants to bi-allelic in both input VCF files, then merging and collapsing overlapping variants back to multi-allelic, destroy some information?
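For reference, the decompose / merge / collapse round trip can be sketched at the site level like this. It is a toy model under stated assumptions: records are plain `(chrom, pos, ref, [alts])` tuples, the helper names `decompose` and `collapse` are made up, normalization is skipped, and genotype, INFO and FORMAT fields are ignored entirely, so it says nothing about what happens to those.

```python
from collections import defaultdict


def decompose(records):
    """Split (chrom, pos, ref, [alts]) records into bi-allelic tuples."""
    return {(chrom, pos, ref, alt)
            for chrom, pos, ref, alts in records
            for alt in alts}


def collapse(biallelic):
    """Group bi-allelic tuples sharing chrom, pos and REF back together."""
    grouped = defaultdict(set)
    for chrom, pos, ref, alt in biallelic:
        grouped[(chrom, pos, ref)].add(alt)
    return [(chrom, pos, ref, sorted(alts))
            for (chrom, pos, ref), alts in sorted(grouped.items())]


# Hypothetical records from the two input files.
file_a = [("1", 100, "A", ["C", "G"])]
file_b = [("1", 100, "A", ["G", "T"])]

# Merge as a set union of bi-allelic alleles, then collapse per site.
merged = collapse(decompose(file_a) | decompose(file_b))
print(merged)  # [('1', 100, 'A', ['C', 'G', 'T'])]
```

Note that this only collapses records with an identical REF at the same position; overlapping records with different REF alleles would need extra handling.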

vcf bcftools • 682 views

Would first decomposing and normalizing all variants to bi-allelic in both input VCF files, then merging and collapsing overlapping variants back to multi-allelic, destroy some information?

Can you give me an example? The three steps you list at the top describe the left-aligned, most parsimonious representation that vt normalize and bcftools norm produce (I prefer the former).
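As a rough illustration of what "left-aligned and most parsimonious" means for a single bi-allelic variant, here is a small sketch: trim shared bases on the right, extend to the left whenever an allele becomes empty, then trim shared bases on the left. The function name `normalize` and the toy reference are assumptions, and this is not the actual code of vt or bcftools.

```python
def normalize(ref_seq, pos, ref, alt):
    """Left-align and trim one bi-allelic variant. pos is 1-based."""
    if ref == alt:
        raise ValueError("REF and ALT are identical, nothing to normalize")
    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, pad both with the preceding base.
        if not ref or not alt:
            if pos == 1:
                raise ValueError("cannot extend left of the sequence start")
            pos -= 1
            base = ref_seq[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Drop shared leading bases while both alleles keep at least one base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Toy reference with a CA repeat (hypothetical data).
reference = "GGCACACAT"

# Three ways of writing the same CA deletion normalize to one record.
print(normalize(reference, 6, "ACA", "A"))    # (2, 'GCA', 'G')
print(normalize(reference, 2, "GCAC", "GC"))  # (2, 'GCA', 'G')
print(normalize(reference, 1, "GGCA", "GG"))  # (2, 'GCA', 'G')
```

Once both files are normalized this way, identical alleles end up with identical POS/REF/ALT, which is what makes a plain merge on those fields meaningful.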
