Hi, I'm working through the QC steps of Andries T. Marees' GWA tutorial and I'm stuck at step 7, where the population stratification analysis begins with downloading a 61 GB vcf.gz file from 1000 Genomes containing genetic data of 629 individuals from different ethnic backgrounds. The variants should then be extracted with PLINK to obtain bim, fam and bed files.
wget ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz
During the extraction I found out that the file is corrupted; I also tried to unzip it, without success. I have downloaded it 4 times and it is corrupted every time. Has anyone else run into this issue?
Tutorial: https://github.com/MareesAT/GWA_tutorial
Article: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6001694/
Maybe your download is broken because of your connection. You can resume it with
wget -c
using the same link.
I had the same idea. To be sure, I ran the tests in different places, with different computers and connections, and got the same outcome.
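One way to tell a truncated download apart from a genuinely bad file might be to test the gzip stream directly before handing it to PLINK. The sketch below assumes `gzip` is installed; `verify_gz` is a hypothetical helper name, not something from the tutorial.

```shell
# Sketch: check whether a downloaded .gz file is a complete gzip stream.
# verify_gz is a hypothetical helper, not part of the GWA tutorial.
verify_gz() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "missing: $file" >&2
        return 2
    fi
    # gzip -t reads the whole compressed stream and checks its integrity
    # without writing any decompressed output to disk.
    if gzip -t "$file" 2>/dev/null; then
        echo "OK: $file"
    else
        echo "corrupt or truncated: $file" >&2
        return 1
    fi
}
```

Calling `verify_gz ALL.2of4intersection.20100804.genotypes.vcf.gz` in the download directory should report whether the stream is intact. On a file of this size the check has to read everything, so expect it to take a while.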
The file hasn't changed since 2010, so the file itself is not the problem. Try downloading it from a browser, from here:
https://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/
I tried with the same link; the problem is your connection speed.
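If a slow or unstable connection is the cause, a resumable download with retries may help. This is only a sketch: `resume_download` is a hypothetical wrapper, and the retry and timeout values are arbitrary assumptions rather than recommended settings.

```shell
# Sketch: resume a partial download and retry when the connection stalls.
# resume_download is a hypothetical wrapper; the numeric values are assumptions.
resume_download() {
    url=$1
    # --continue   resume the partial file in the current directory instead of restarting
    # --tries      how many times to retry before giving up
    # --waitretry  maximum seconds to wait between retries
    # --timeout    seconds before a stalled connection counts as failed
    wget --continue --tries=20 --waitretry=30 --timeout=60 "$url"
}
```

Run it from the directory that holds the partial file, for example `resume_download ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz`. Resuming only works if the partial file still has the same name as the remote one.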