I previously asked this question here: Reading vcf file using python gives UnicodeDecodeError, and the answer was to use vcftools instead of bcftools when merging the files (the file I was trying to read is a merged file). At the time this fixed my issue and I moved on.
I have now revisited the files to understand why one of them worked and the other did not. I found that when I used bcftools to merge the three files, it gave this warning:
[W::bcf_hdr_merge] Trying to combine "PS" tag definitions of different types
As a result, the file contains strange characters:
GT:GQ:DP:GL:PS:ADALL:AD 0|1:415:.:-57.1927,0,-41.4646:<8D>^L:.:.
GT:GQ:DP:GL:PS:ADALL:AD 1|0:346:.:-56.3461,0,-34.6188:<8D>^L:.:. ./.:.:.:.:^A:.:.
As you can see, <8D>^L and ^A should not be in the output.
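To locate such characters programmatically, here is a minimal sketch. It assumes the file is read in binary mode and that anything other than a tab or printable ASCII is suspicious (legitimate UTF-8 text would be flagged too). The function name `suspicious_bytes` and the sample line are made up for illustration; `<8D>^L` is how a pager displays the bytes 0x8D and 0x0C.

```python
def suspicious_bytes(line: bytes):
    """Return (offset, byte value) pairs for bytes that are neither
    a tab nor printable ASCII. Trailing line endings are ignored."""
    stripped = line.rstrip(b"\r\n")
    return [
        (offset, value)
        for offset, value in enumerate(stripped)
        if value != 0x09 and not 0x20 <= value <= 0x7E
    ]


# Hypothetical sample field resembling the corrupted PS value above.
sample = b"0|1:415:.:-57.1927,0,-41.4646:\x8d\x0c:.:.\n"

for offset, value in suspicious_bytes(sample):
    print(f"offset {offset}: 0x{value:02X}")
```

Reading in binary mode also avoids the UnicodeDecodeError, because no decoding is attempted.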
When I looked at the same locations in the file merged with vcftools:
GT:GQ:DP:GL:PS:ADALL:AD 0|1:.:.:.:-57.192706988686034,0.0,-41.46462464424806:415:68749
Is there a way to change this behavior in bcftools, or is it an issue in the implementation?
Thanks
What is the definition of ##FORMAT=<ID=PS... in BOTH VCF header files?
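One way to answer this is to pull the PS definition out of each header and compare the declared types. The following is only a sketch: the helper names `format_definition` and `format_type` are hypothetical, and the two header lists are invented examples, not the actual headers of the files in question.

```python
import re


def format_definition(header_lines, tag="PS"):
    """Return the ##FORMAT header line that defines `tag`, or None."""
    prefix = f"##FORMAT=<ID={tag},"
    for line in header_lines:
        if line.startswith(prefix):
            return line.rstrip("\n")
    return None


def format_type(definition):
    """Extract the Type=... value from a ##FORMAT line, or None."""
    if definition is None:
        return None
    match = re.search(r"Type=([A-Za-z]+)", definition)
    return match.group(1) if match else None


# Invented headers for illustration only.
header_a = [
    "##fileformat=VCFv4.2",
    '##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase set">',
]
header_b = [
    "##fileformat=VCFv4.2",
    '##FORMAT=<ID=PS,Number=1,Type=String,Description="Phase set">',
]

type_a = format_type(format_definition(header_a))
type_b = format_type(format_definition(header_b))
if type_a != type_b:
    print(f"PS type mismatch: {type_a} vs {type_b}")
```

For real files, the header lines are those starting with `##` at the top of each VCF; for a gzip-compressed file they could be read with `gzip.open(path, "rt")`.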