SNPs/indels with individual genotypes from the 1000 Genomes FTP site
6.4 years ago
lait ▴ 180

Sorry if this might be a trivial question!

I have read a lot about this and ended up lost. I need to download WGS VCF files from the 1000 Genomes FTP site. I need the variants (SNVs and indels) and, most importantly, the individual genotypes of every person in the study.

For example, this file:

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz

which has been referenced many times on Biostars, does not contain individual genotypes. I need something similar to this file but with per-sample genotypes. Is there one global file containing SNPs/indels for the WGS data, including the genotypes of the various samples?

thanks!

1000 genomes ftp genotypes vcf • 3.0k views
6.4 years ago

You can download the entire data set per chromosome (chr1-22 and chrX), including individual genotypes for both indels and SNPs, using this code:

prefix="ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr"
suffix=".phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz"

# Download each chromosome's genotype VCF together with its index (.tbi)
for chr in {1..22} X; do
    wget "${prefix}${chr}${suffix}" "${prefix}${chr}${suffix}.tbi"
done
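
Since the question turns on whether a file carries per-sample genotypes or only sites, a quick check on any downloaded file may help. The following is a minimal sketch, not part of the original answer: `count_samples` is a hypothetical helper name, and it assumes a standard VCF header in which the `#CHROM` line lists eight fixed columns, then `FORMAT`, then one column per sample.

```shell
#!/usr/bin/env bash
# count_samples (hypothetical helper): read an uncompressed VCF on stdin
# and print how many sample columns follow the FORMAT column.
# A sites-only VCF ends at INFO (8 columns), so it reports 0.
count_samples() {
    awk -F'\t' '/^#CHROM/ { n = NF - 9; if (n < 0) n = 0; print n; exit }'
}

# Example call on a downloaded file (commented out so the sketch runs
# without any data present):
#   gzip -dc ALL.chr21.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz | count_samples
```

A result of 0 suggests a sites-only file like the one in the question; any positive number is the count of samples with genotype columns.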

From: Produce PCA bi-plot for 1000 Genomes Phase III in VCF format

Kevin


I can't get the files; it says the host cannot be resolved. I also tried the NCBI website, but none of the pages will open. Is there another way to download the human VCF files directly from the terminal?


I can connect; I did so just 10 seconds ago. Where are you downloading the data to?


Thank you, but these are also giving me timeout errors. Downloading over HTTP worked really fast, though ($ wget http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr21.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz.tbi). Is there a problem with the server? Do these sites point to the same data?


That file is a tabix index file, which is very small, so it will download quickly almost anywhere unless you are on a dial-up connection of 7.5 kbps or less.

Let's just try chr1 variants, first:

wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz
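
If FTP keeps timing out, as reported earlier in this thread, one option is to retry the same path over HTTP. The sketch below rests on an assumption: the thread only confirms that the chr21 `.tbi` file is reachable over `http://` on the same host, so whether every file is served at the same path is not verified here. `to_http` and `fetch` are hypothetical helper names, and the timeout and retry values are arbitrary.

```shell
#!/usr/bin/env bash
# to_http (hypothetical helper): rewrite an ftp:// URL to http:// on the
# same host and path; any other URL is printed unchanged.
to_http() {
    local url=$1
    case $url in
        ftp://*) printf 'http://%s\n' "${url#ftp://}" ;;
        *)       printf '%s\n' "$url" ;;
    esac
}

# fetch (hypothetical helper): try the URL as given, then fall back to
# the http:// form. --continue resumes a partially downloaded file.
fetch() {
    local url=$1
    wget --continue --timeout=30 --tries=2 "$url" ||
        wget --continue --timeout=30 --tries=2 "$(to_http "$url")"
}

# Example call (commented out so the sketch runs without network access):
#   fetch "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz"
```

Because the chromosome VCFs are large, resuming with `--continue` avoids starting over after a dropped connection.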