Okay, so I have an idea of what needs to be done. I need to download a VCF for the region corresponding to each SNP. I'm saving these in one directory per chromosome, and I intend to merge the files within each directory using vcf-merge, then concatenate the per-chromosome results with vcf-concat to form one file.
The first part is fine: I can download all the regions without a problem. However, I get an error whilst merging. For example, let's say I want two SNPs and use tabix to get them:
./tabix -f -h ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz 1:145604788-145604789 | perl vcf-subset -c EUR.samples.list | ./bgzip -c > A.vcf.gz
./tabix -f -h ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz 1:145604791-145604791 | perl vcf-subset -c EUR.samples.list | ./bgzip -c > B.vcf.gz
So I'm now left with 3 files: A.vcf.gz, B.vcf.gz, and an index file. I want to merge A and B, so I run:
perl vcf-merge A.vcf.gz B.vcf.gz > C.vcf
However, I get the error:
[main] fail to load the index file.
The command "tabix -l A.vcf.gz" exited with an error. Is the file tabix indexed?
at Vcf.pm line 172.
Vcf::throw('Vcf4_1=HASH(0x107a048)', 'The command "tabix -l A.vcf.gz" exited with an error. Is the ...') called at Vcf.pm line 2673
VcfReader::get_chromosomes('Vcf4_1=HASH(0x107a048)') called at vcf-merge line 197
main::init_cols('HASH(0x10b7ab0)', 'Vcf4_2=HASH(0x1079d78)') called at vcf-merge line 279
main::merge_vcf_files('HASH(0x10b7ab0)') called at vcf-merge line 1
I see that it's missing an index; however, http://samtools.sourceforge.net/tabix.shtml says that tabix creates an index file only when no region is given on the command line.
What am I doing wrong here? I feel like I'm probably missing out a step or something...
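One thing that might be worth trying, assuming the missing index really is the whole problem: the index file left in the directory is presumably the one tabix fetched for the remote 1000 Genomes file, so A.vcf.gz and B.vcf.gz would still have no .tbi of their own. The sketch below indexes any bgzipped VCF that lacks one before merging. The helper names missing_index and index_missing and the TABIX variable are made up for illustration; only tabix -p vcf is a real tabix option.

```shell
#!/bin/sh
# Sketch only: index each bgzipped VCF that has no .tbi next to it.
# TABIX is an assumed variable pointing at the tabix binary (./tabix in the post).
TABIX="${TABIX:-./tabix}"

# Hypothetical helper: print the files that have no .tbi index beside them.
missing_index() {
    for f in "$@"; do
        [ -e "$f.tbi" ] || printf '%s\n' "$f"
    done
}

# Hypothetical helper: build a VCF index for every unindexed file.
# Stops at the first file that tabix fails on.
index_missing() {
    for f in "$@"; do
        if [ ! -e "$f.tbi" ]; then
            "$TABIX" -p vcf "$f" || return 1
        fi
    done
}
```

With these defined, something like index_missing A.vcf.gz B.vcf.gz && perl vcf-merge A.vcf.gz B.vcf.gz > C.vcf would only attempt the merge once both files have an index. Whether vcf-merge then produces the output I actually want is a separate question.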
Edit: Oops, I think I put this in the wrong section, this should be a comment, not an answer!
Genotypes:
http://www.1000genomes.org/faq/can-i-get-genotypes-specific-individualpopulation-your-vcf-files
Convert to PLINK ped/map format:
http://www.1000genomes.org/faq/can-i-convert-vcf-files-plinkped-format
Haplotypes:
http://www.1000genomes.org/faq/can-i-get-haplotype-data-1000-genomes-individuals
Sorry, I should have added that I know how to do this for individual SNPs; I was just wondering whether a tool exists that would do it for me en masse.