I am trying to incorporate variants from a multisample VCF file into a reference sequence using bcftools consensus, as follows.
First, I download the variant and reference files:
bcftools view -Oz -r 7:30911853-30925516 "http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/release/20190312_biallelic_SNV_and_INDEL/ALL.chr7.shapeit2_integrated_snvindels_v2a_27022019.GRCh38.phased.vcf.gz">aqp1.1000g.vcf.gz
tabix -p vcf aqp1.1000g.vcf.gz
wget http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa
samtools faidx GRCh38_reference.fa.gz
and then run the following script for each sample:
#!/bin/bash
for sample in `bcftools view -h aqp1.1000g.vcf.gz | grep "^#CHROM" | cut -f10-`; do
bcftools view -c1 -Oz -s $sample -o 1000g.$sample.vcf.gz aqp1.1000g.vcf.gz
tabix -p vcf 1000g.$sample.vcf.gz
samtools faidx GRCh38_reference.fa.gz chr7:30911853-30925516 | bcftools consensus 1000g.$sample.vcf.gz -o 1000g.aqp1.$sample.fasta
done
I am getting the following error:
Note: the --sample option not given, applying all records regardless of the genotype
Warning: Sequence "chr7" not in 1000g.HG00096.vcf.gz
Applied 0 variants
Note: the --sample option not given, applying all records regardless of the genotype
Warning: Sequence "chr7" not in 1000g.HG00097.vcf.gz
Applied 0 variants
Note: the --sample option not given, applying all records regardless of the genotype
Warning: Sequence "chr7" not in 1000g.HG00099.vcf.gz
Applied 0 variants
Kindly help.
This is a repost of your previous question at Concensus from 1000 genome project, and the same answer that I provided there applies here: you need to make sure that both data files use the same chromosome names. The error message is telling you that your file 1000g.HG00099.vcf.gz does not have the sequence name chr7.
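One way to confirm the naming mismatch is to list the contig names that each file actually uses. The following is a minimal sketch, not a definitive recipe; the helper names vcf_contigs and fasta_contigs are hypothetical (they are not part of bcftools or samtools) and both read plain text from standard input.

```shell
#!/bin/bash

# Hypothetical helper: print the distinct contig names found in the
# CHROM column (column 1) of uncompressed VCF text read from stdin.
vcf_contigs() {
    grep -v '^#' | cut -f1 | sort -u
}

# Hypothetical helper: print the sequence names from FASTA header lines
# read from stdin (the first word after '>').
fasta_contigs() {
    grep '^>' | sed 's/^>//' | cut -d' ' -f1
}
```

Usage would look like `bcftools view aqp1.1000g.vcf.gz | vcf_contigs` and `fasta_contigs < GRCh38_full_analysis_set_plus_decoy_hla.fa`. Because the region was requested as 7:30911853-30925516, the VCF presumably reports 7 while the reference reports chr7. For a large reference it is quicker to read the first column of the .fai index written by samtools faidx than to scan the whole FASTA.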
Where in the VCF file should I change the chromosome name to chr7, given that it is a multisample VCF file? And what about the message saying that the --sample option was not given?
I wrote an answer to this post with an example.
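For reference, here is a minimal sketch of one way to address both points raised in this thread. It assumes the VCF names the chromosome 7 and the reference names it chr7. The file chr_map.txt and the function names rename_vcf_contigs and sample_consensus are hypothetical, chosen only for illustration. The renaming is done once on the multisample VCF with bcftools annotate --rename-chrs, and the sample is then passed to bcftools consensus with -s, so the VCF never has to be edited by hand.

```shell
#!/bin/bash

# Mapping file for --rename-chrs: "old_name new_name", one pair per line.
# Assumption: the VCF uses "7" and the reference FASTA uses "chr7".
printf '7\tchr7\n' > chr_map.txt

# Hypothetical wrapper: rename the contigs of the multisample VCF once,
# then index the result so it can be queried by region.
rename_vcf_contigs() {
    local in_vcf=$1 out_vcf=$2
    bcftools annotate --rename-chrs chr_map.txt -Oz -o "$out_vcf" "$in_vcf"
    tabix -p vcf "$out_vcf"
}

# Hypothetical wrapper: extract the AQP1 region from the reference and
# apply one sample's genotypes to it. Passing -s selects that sample,
# so the "--sample option not given" note no longer applies.
sample_consensus() {
    local ref=$1 vcf=$2 sample=$3
    samtools faidx "$ref" chr7:30911853-30925516 |
        bcftools consensus -s "$sample" -o "1000g.aqp1.$sample.fasta" "$vcf"
}
```

A possible invocation: run `rename_vcf_contigs aqp1.1000g.vcf.gz aqp1.1000g.chr.vcf.gz` once, then loop with `for sample in $(bcftools query -l aqp1.1000g.chr.vcf.gz); do sample_consensus GRCh38_full_analysis_set_plus_decoy_hla.fa aqp1.1000g.chr.vcf.gz "$sample"; done`. Since the data are phased, adding `-H 1` or `-H 2` to bcftools consensus should select a single haplotype. With -s on the multisample file, the per-sample bcftools view step is likely unnecessary.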