Hey everyone,
I am trying to split the multiallelic sites of my VCF. I used bcftools norm -m -any. However, the result does not seem reasonable to me. Here's an example.
Let's say I have this multiallelic site:
REF ALT GT1 GT2 GT3
A C,G 1/2 0/2 0/1
After splitting I get these two:
REF ALT GT1 GT2 GT3
A C 1/0 0/0 0/1
A G 0/1 0/1 0/0
So, the "unused" ALT allele in each row is simply set to REF. Is there a way to change this behavior? I don't think it is reasonable to do it this way, at least for my analysis. I would like my result to look more like this:
REF ALT GT1 GT2 GT3      GT1 GT2 GT3
A   C   1/. 0/. 0/1  or  ./. ./. 0/1
A   G   ./1 0/1 0/.      ./. 0/1 ./.
Or something similar. At the very least, I don't want to have REF where there was an ALT before.
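
To make the behavior I'm after concrete, here is a minimal Python sketch of the recoding rule. It is not bcftools code: the function name split_gt and the 1-based alt_index argument are made up for illustration, and it only handles plain GT strings like the ones in my example. For the split record of a given ALT, REF stays 0, that ALT becomes 1, and any other ALT allele becomes missing (.).

```python
import re


def split_gt(gt: str, alt_index: int) -> str:
    """Recode one multiallelic GT string for the split record of a single ALT.

    alt_index is the 1-based number of the ALT allele kept in this record.
    REF (0) and missing (.) are left alone, the kept ALT becomes 1,
    and every other ALT allele becomes missing (.) instead of REF.
    """
    # Keep the original separator (phased "|" or unphased "/").
    sep = "|" if "|" in gt else "/"
    recoded = []
    for allele in re.split(r"[/|]", gt):
        if allele in ("0", "."):
            recoded.append(allele)
        elif int(allele) == alt_index:
            recoded.append("1")
        else:
            recoded.append(".")
    return sep.join(recoded)


# The example site from above: REF=A, ALT=C,G
genotypes = ["1/2", "0/2", "0/1"]
for alt_index, alt in enumerate(["C", "G"], start=1):
    print("A", alt, *[split_gt(gt, alt_index) for gt in genotypes])
```

This reproduces the first variant of the table I sketched above (1/. 0/. 0/1 for C and ./1 0/1 0/. for G). The stricter variant would set the whole genotype to ./. whenever any other ALT allele is present.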