Hi all,
I'm trying to filter some VCFs from 1000G by population. For each population I have a list of sample IDs and am using the vcftools "--keep" option.
Given a single chromosome and a single population, I am able to run the command and get the output I want. However, I am running into issues when I try to iterate using a simple shell script. Ultimately I want to iterate over both populations and chromosomes using a nested for-loop, but at the moment I can't seem to get a simple loop over chromosomes (holding population constant) to work properly.
#!/bin/bash
pop=CDX
for (( i=1; i<=22; i++ ))
do
echo ${i}
../../../../vcftools/bin/vcftools --vcf ALL.chr${i}.phase3_shapeit2_integrated.20130502.snps_indels_svs.genotypes.vcf --keep ../indivs_by_pop/${pop}_phased_indivs.txt --recode --out filter_chrom${i}_${pop}
echo ${pop}
done
This script performs the command for chromosome 1 and then stops -- it does not get to the "echo ${pop}" line, and I have no idea why. If I comment out the vcftools command then it properly iterates 22 times, which suggests to me that the problem is that command rather than the rest of the script. I've been reading through the vcftools documentation to try and figure out where I'm going wrong but to no avail. Any insight would be greatly appreciated!
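For reference, the nested loop over populations and chromosomes that I'm ultimately after might look roughly like the sketch below. It is only a sketch under assumptions: CHB is a placeholder second population (only CDX appears above), the helper name vcf_cmd and the DRY_RUN switch are made up for illustration, and the paths are the same relative paths as in my script. By default it only prints the commands, which makes it easier to see whether the loop itself or the vcftools call is what stops the iteration.

```shell
#!/bin/bash
# Sketch only: CHB is a placeholder population; vcf_cmd and DRY_RUN are
# hypothetical names introduced for this example.
pops="CDX CHB"
vcftools_bin=../../../../vcftools/bin/vcftools

# Print the vcftools command line for one chromosome/population pair.
vcf_cmd() {
    chrom=$1
    pop=$2
    echo "$vcftools_bin --vcf ALL.chr${chrom}.phase3_shapeit2_integrated.20130502.snps_indels_svs.genotypes.vcf --keep ../indivs_by_pop/${pop}_phased_indivs.txt --recode --out filter_chrom${chrom}_${pop}"
}

for pop in $pops; do
    for chrom in $(seq 1 22); do
        cmd=$(vcf_cmd "$chrom" "$pop")
        if [ "${DRY_RUN:-1}" = 1 ]; then
            # Dry run (the default): show the command without running it.
            echo "$cmd"
        else
            # Real run: report a failure but keep going with the next pair.
            # Relies on word splitting, so the paths must not contain spaces.
            $cmd || echo "vcftools failed for chr${chrom} ${pop}" >&2
        fi
    done
done
```

Running it as is prints one command per chromosome/population pair; running it with DRY_RUN=0 in the environment executes them and writes a message to stderr for any pair where vcftools exits with a non-zero status.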
Thank you! I still wish I knew why my script doesn't work, but it looks like this tool will obviate the need for such scripts in the future.