Hello all,
I would like to know how to extract a separate VCF file for each sample from a multi-sample VCF file containing more than 1000 samples.
Thank you
You can use bcftools view with the -s option, which restricts the output to the named sample(s). https://samtools.github.io/bcftools/bcftools.html#view
bcftools query -l
prints all the sample names in the file, which you can loop over. Here is an example snippet I use. It creates a directory and writes each sample to its own file, named after the sample, in that directory.
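MULTISAMPLEVCF="mymultisample.vcf.gz"
OUTDIR="splitsamples"
mkdir -p "$OUTDIR"
# List the sample names, then write one bgzip-compressed VCF per sample.
for sample in $(bcftools query -l "$MULTISAMPLEVCF"); do
    echo "$sample"
    # -c 1 keeps only sites where this sample has at least one non-reference allele.
    bcftools view -c 1 -s "$sample" -Oz -o "$OUTDIR/$sample.vcf.gz" "$MULTISAMPLEVCF"
done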
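With more than 1000 samples, the loop above reads the whole multi-sample file once per sample, which can be slow. As a possible alternative, here is a minimal sketch using the bcftools split plugin, which writes all per-sample files in a single pass. It assumes your bcftools build is recent enough to ship that plugin and that plugins are installed. The helper name split_with_plugin and the file names are placeholders of mine, and the -i 'GT="alt"' filter is only meant to approximate what -c 1 does in the loop, so check a few outputs before relying on it.

```shell
#!/bin/sh
# Sketch: split a multi-sample VCF in one pass with the bcftools "split" plugin.
# Assumption: bcftools is on PATH and was installed together with its plugins.

# split_with_plugin VCF OUTDIR   (hypothetical helper name)
split_with_plugin() {
    vcf=$1
    outdir=$2
    # Create the output directory; stop if that fails.
    mkdir -p "$outdir" || return 1
    # -Oz writes bgzip-compressed VCFs, -o names the output directory.
    # -i 'GT="alt"' is intended to keep only sites where a sample carries
    # a non-reference allele, roughly like "-c 1" in the loop above.
    bcftools +split "$vcf" -Oz -o "$outdir" -i 'GT="alt"'
}

# Example call (file names are placeholders):
# split_with_plugin mymultisample.vcf.gz splitsamples
```

If the plugin is not available in your installation, the loop version still works; it just takes longer on large files.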