Hi,
I have a vcf file with phased data ("dataset1"), which I want to analyse together with some other genotype data ("dataset2").
For some loci in dataset1, the ref/alt alleles are swapped relative to dataset2, i.e. for a given SNP I get A/G in one dataset and G/A in the other.
I have two related questions.
Is there any way I can either
(i) quickly check the ref/alt consistency across all my loci in the two datasets, and ideally remove all inconsistent positions? Would the --diff-site-discordance flag from vcftools do something like that?
or
(ii) swap the ref/alt information for the SNPs of my choice directly in the VCF files? I want to avoid converting to PLINK format because I don't want to lose the phase information.
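To make it concrete, this is roughly the operation I have in mind, as a minimal Python sketch and not a tested tool. The function names is_swapped and swap_ref_alt are my own (hypothetical). It assumes biallelic SNPs, tab-separated VCF data lines, and GT as the first FORMAT key. It does not update INFO fields such as AC or AF, and it does not handle strand flips.

```python
def is_swapped(ref1, alt1, ref2, alt2):
    """Return True when dataset1's REF/ALT equal dataset2's ALT/REF (e.g. A/G vs G/A)."""
    return ref1 != alt1 and ref1 == alt2 and alt1 == ref2


def swap_ref_alt(vcf_line):
    """Swap REF and ALT on one biallelic VCF data line and recode GT to match.

    Assumptions of this sketch: the line is tab-separated, the site is
    biallelic, and GT is the first FORMAT key. The allele separator
    ('|' for phased, '/' for unphased) is left untouched, so phase is kept.
    INFO fields (AC, AF, ...) are NOT recalculated.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected a VCF data line with FORMAT and sample columns")
    ref, alt = fields[3], fields[4]
    if "," in alt:
        raise ValueError("multiallelic site, not handled in this sketch")
    if fields[8].split(":")[0] != "GT":
        raise ValueError("GT is not the first FORMAT key")

    # Swap the REF and ALT columns.
    fields[3], fields[4] = alt, ref

    # Recode allele indices 0 <-> 1; missing alleles ('.') and separators pass through.
    flip = {"0": "1", "1": "0"}
    for i in range(9, len(fields)):
        gt, sep, rest = fields[i].partition(":")
        fields[i] = "".join(flip.get(ch, ch) for ch in gt) + sep + rest
    return "\t".join(fields)


if __name__ == "__main__":
    example = "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT:DP\t0|1:12\t1|1:8"
    print(swap_ref_alt(example))
```

So a phased genotype 0|1 at an A/G site would become 1|0 at the corresponding G/A site, with the same haplotype order. What I am looking for is an existing tool that does this (and the check in (i)) properly across whole files.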
Any ideas will be very much appreciated.