Hi guys,
Sorry if this is a basic question, but I've searched all over the internet and couldn't find an answer.
I am working with genotype data from an Illumina genotyping chip (SNP information for ~100 individuals). I have a VCF file (converted from a PLINK file).
I'm following a pipeline that includes a step where it uses bcftools +fixref to "Fix REF allele according to GRCh37":
bcftools +fixref test.bcf -Ob -o output.bcf -- -f ref.fa -m top
The problem is that I don't understand why this step is important.
The bcftools manual states (regarding the above command): "If the output shows that the VCF is TOP-compatible, the following command can be used to fix the strand"
---> But what needs fixing? Considering that I have already converted all my SNPs to the positive strand, I simply don't know what this command does or why it is important.
Note: Technically, I could just blindly follow the pipeline, but I'm really trying to understand what I'm doing here, so any help is appreciated :)
Here's the output of the command: