4.9 years ago
dec986
Following the advice in "Switch ref/alt alleles vcf file", I am attempting to switch the alt and ref alleles in a file with 659,535 sites.
I am running the following, as suggested by the tool itself:
bcftools +fixref vcf/HG00101.bb.include_variants.vcf.gz -Ob -o tmp.vcf.gz -- -f human_g1k_v37.fasta -i IM.NA19900.SM.vcf.gz
and the output is
# SC, guessed strand convention
SC TOP-compatible 1
SC BOT-compatible 1
# ST, substitution types
ST A>C 0 -nan%
ST A>G 0 -nan%
ST A>T 0 -nan%
ST C>A 0 -nan%
ST C>G 0 -nan%
ST C>T 0 -nan%
ST G>A 0 -nan%
ST G>C 0 -nan%
ST G>T 0 -nan%
ST T>A 0 -nan%
ST T>C 0 -nan%
ST T>G 0 -nan%
# NS, Number of sites:
NS total 659352
NS ref match 0 -nan%
NS ref mismatch 0 -nan%
NS flipped 0 -nan%
NS swapped 0 -nan%
NS flip+swap 0 -nan%
NS unresolved 0 -nan%
NS fixed pos 0 -nan%
NS skipped 659352
NS non-ACGT 0
NS non-SNP 659352
NS non-biallelic 0
So almost every site is skipped. Why?
The output BCF, tmp.vcf, displays exactly the same data that I put in: the alternate alleles and IDs are missing.
How can I run bcftools to add the alternate alleles and IDs according to IM.NA19900.SM.vcf.gz?
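Since every site in the report above lands in the "non-SNP" row, one quick check is to tally what the ALT column of the input actually contains. The sketch below is only a diagnostic, not a fix: it assumes that records whose ALT is "." (or a symbolic allele such as a gVCF block) are what the plugin counts as non-SNP, and the helper name tally_alt is made up for illustration.

```shell
# Hypothetical helper: tally the values in the ALT column (5th field)
# of VCF text read from stdin. Header lines (starting with '#') are ignored.
# Output is "ALT<TAB>count", sorted by ALT in byte order.
tally_alt() {
    awk -F'\t' '!/^#/ { n[$5]++ } END { for (a in n) print a "\t" n[a] }' | LC_ALL=C sort
}

# Usage (file name taken from the command above):
#   bcftools view vcf/HG00101.bb.include_variants.vcf.gz | tally_alt
```

If nearly all of the counts fall under "." then the input has no alternate alleles for the plugin to work with, which would be consistent with the report.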
Do both input VCF files use the same notation for the chromosomes: 'chr1' vs '1'?
The chromosomes all match.
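For anyone who wants to verify the chromosome naming themselves, here is a minimal sketch that lists the distinct chromosome names in each file so the two lists can be compared. The helper name chrom_names and the .chroms output files are hypothetical; the VCF file names are the ones from the command in the question.

```shell
# Hypothetical helper: print the distinct chromosome names (1st field)
# of VCF text read from stdin, sorted in byte order.
chrom_names() {
    awk -F'\t' '!/^#/ { print $1 }' | LC_ALL=C sort -u
}

# Usage:
#   bcftools view -H vcf/HG00101.bb.include_variants.vcf.gz | chrom_names > input.chroms
#   bcftools view -H IM.NA19900.SM.vcf.gz | chrom_names > reference.chroms
#   diff input.chroms reference.chroms
```

An empty diff means both files use the same set of names; a 'chr1' vs '1' mismatch would show up as every line differing.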
I thought that you could do this with
bcftools norm
and the following option:
Unfortunately, this still keeps things in gVCF format :(