Sorry if this might be a trivial question!
I have read a lot about this and ended up lost. I need to download the WGS VCF files from the 1000 Genomes FTP site. I need the variants (SNVs and indels) and, most importantly, the individual genotypes of all the samples.
So, for example, this file:
which has been referenced many times on Biostars, does not contain individual genotypes. I need something similar to what those files contain, but with genotypes. Is there one global file containing SNVs/indels for the WGS data, including the genotypes of the various samples?
Thanks!
I can't get the files; it says the host is not resolvable. I also tried the NCBI website, but none of the pages can be opened. Is there another way to download the human VCF files directly from the terminal?
I can connect - I did it just now 10 seconds ago. To where are you downloading the data?
I tried several; the only page that opened is http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ . The others were ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/ and the links provided from https://www.ncbi.nlm.nih.gov/variation/docs/human_variation_vcf/ and http://www.internationalgenome.org/data#download .
The direct links are:
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr2.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr3.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr4.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr5.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr6.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr7.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
et cetera
...and the tab-index files:
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz.tbi
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr2.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz.tbi
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr3.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz.tbi
et cetera
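The per-chromosome links above all follow one pattern, so rather than pasting each one, a small shell loop can generate them. This is only a sketch: `vcf_url` and `list_urls` are hypothetical helper names, and it assumes chromosomes 1 to 22 all use the same file name pattern as the links listed above (the sex chromosomes and MT may be named differently, so check the directory listing for those). `BASE` can be overridden, for example with the `http://` form of the same directory if FTP is blocked on your network.

```shell
#!/bin/sh
# Release directory taken from the links above; override BASE to try another mirror or scheme.
BASE="${BASE:-ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502}"

# Hypothetical helper: print the genotype VCF URL for one autosome (argument: 1..22).
vcf_url() {
    printf '%s/ALL.chr%s.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz\n' "$BASE" "$1"
}

# Hypothetical helper: print the VCF URL and its .tbi index URL for chromosomes 1-22.
list_urls() {
    chr=1
    while [ "$chr" -le 22 ]; do
        vcf_url "$chr"
        printf '%s.tbi\n' "$(vcf_url "$chr")"
        chr=$((chr + 1))
    done
}

# To download everything, feed the list to wget; -c resumes partial files,
# -i - reads the URLs from standard input:
#   list_urls | wget -c -i -
```

Running `list_urls` on its own just prints the 44 URLs, which is a cheap way to check them before starting a large download.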
Thank you, but these are also giving me timeout errors. It worked really fast, though, with
$ wget http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr21.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz.tbi
Are there problems with the server? Are these sites pointing at the same data?
That file is a tab-index file, which is very small, so it will download very quickly in most places unless you are using a dial-up modem of 7.5 kbps (or less).
Let's just try the chr1 variants first: