Question

1000 genome download

4.6 years ago

brendaumoh6 ▴ 10

Please, I need directions on how to download the Phase 3 1000 Genomes data for the African population.

gene • 4.3k views

ADD COMMENT • link updated 4.6 years ago by chrchang523 11k • written 4.6 years ago by brendaumoh6 ▴ 10

Did you take a look at the FAQ provided by 1000 genomes project?

ADD REPLY • link 4.6 years ago by GenoMax 147k

Yes, I did, but all I saw were values; I don't really know which is for which population.

ADD REPLY • link 4.6 years ago by brendaumoh6 ▴ 10

https://www.internationalgenome.org/faq/can-i-get-genotypes-specific-individualpopulation-your-vcf-files/

ADD REPLY • link 4.6 years ago by ATpoint 85k

score 0 · Answer 1 · 2020-04-28

Hey brendaumoh6,

If you follow steps 1-5 of my tutorial ( Produce PCA bi-plot for 1000 Genomes Phase III - Version 2 ), you will have the entire phased 1000 Genomes Phase III dataset on your disk, which can be used time and time again for future analyses. Information about the African population will be in the PED file that you also download - this can be used to filter the data for just the African samples.

Unfortunately, I am not aware of anybody who has split the 1000 Genomes data into the individual population groups. It's likely something that I would do if I actually had a tenured academic position.

Kevin

ATpoint · Answer 2 · 2020-04-28

4.6 years ago

chrchang523 11k

As others have noted, the primary-source way to do this is to use a pedigree file provided by 1000 Genomes to filter the full dataset down to just the African samples of interest (which correspond to a superpopulation of "AFR").
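The filtering step described above can be sketched in shell as follows. This is only a minimal sketch, assuming a tab-delimited sample table with a single header line. `select_samples` is a hypothetical helper, not part of any 1000 Genomes or PLINK tooling, and the file and column names in the commented usage lines are assumptions that should be checked against the files you actually downloaded.

```shell
# select_samples FILE ID_COLUMN GROUP_COLUMN VALUE...
# Hypothetical helper: print the ID of every row whose GROUP_COLUMN equals
# one of the VALUEs. FILE is assumed to be tab-delimited with one header line.
select_samples() {
    file=$1; id_col=$2; group_col=$3
    shift 3
    awk -F'\t' -v id_col="$id_col" -v group_col="$group_col" -v values="$*" '
        BEGIN {
            # Build a lookup table of the requested group labels.
            n = split(values, v, " ")
            for (i = 1; i <= n; i++) keep[v[i]] = 1
        }
        NR == 1 {
            # Locate the two columns by header name.
            for (i = 1; i <= NF; i++) {
                if ($i == id_col) id = i
                if ($i == group_col) grp = i
            }
            if (!id || !grp) { print "column not found" > "/dev/stderr"; exit 1 }
            next
        }
        ($grp in keep) { print $id }
    ' "$file"
}

# Assumed usage with the release panel file (columns: sample, pop, super_pop, gender):
# select_samples integrated_call_samples_v3.20130502.ALL.panel sample super_pop AFR > afr_samples.txt
#
# Assumed usage with a PED file that only lists population codes:
# select_samples samples.ped "Individual ID" Population YRI LWK GWD MSL ESN ASW ACB > afr_samples.txt
```

The resulting list, one sample ID per line, could then be passed to a tool that subsets samples, for example `bcftools view -S afr_samples.txt`.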

A quick alternative is to use the plink2-format fileset posted at https://www.cog-genomics.org/plink/2.0/resources#1kg_phase3 . This includes SuperPop and Population annotations for each sample, so the following command line extracts just the African samples (assuming the .pvar file is still compressed; that is what 'vzs' refers to):

plink2 --pfile all_phase3 vzs \
       --keep-cat-pheno SuperPop \
       --keep-cat-names AFR \
       --make-pgen \
       --out afr_phase3

and you can convert to BCF format with

plink2 --pfile afr_phase3 \
       --export bcf

ADD COMMENT • link 4.6 years ago by chrchang523 11k

Thanks for your response. I have downloaded the phase3_corrected.psam\?dl\=1 file from the plink2 website. I ran this command:

plink2 --pfile all_phase3 vzs \
       --keep-cat-pheno SuperPop \
       --keep-cat-names AFR \
       --make-pgen \
       --out afr_phase3
But I got an error message:
Start time: Wed Apr 29 11:14:10 2020
193440 MiB RAM detected; reserving 96720 MiB for main workspace.
Using up to 16 threads (change this with --threads).
Error: Failed to open all_phase3.pvar.zst?dl=1.pgen : No such file or
directory.

How do I resolve this issue?

ADD REPLY • link updated 4.6 years ago by ATpoint 85k • written 4.6 years ago by brendaumoh6 ▴ 10

After downloading, you need to rename phase3_corrected.psam to all_phase3.psam, to match the other two files; sorry about not explicitly stating this in the initial answer.

ADD REPLY • link 4.6 years ago by chrchang523 11k

I get the same kind of error, but this time it says there is no such file or directory for all_phase3.pgen:

Start time: Wed Apr 29 14:59:09 2020
193440 MiB RAM detected; reserving 96720 MiB for main workspace.
Using up to 16 threads (change this with --threads).
Error: Failed to open all_phase3.pgen : No such file or directory.
End time: Wed Apr 29 14:59:09 2020

ADD REPLY • link updated 4.6 years ago by GenoMax 147k • written 4.6 years ago by brendaumoh6 ▴ 10

To help, please confirm your PLINK version, and always show the exact commands that you are using. Also confirm that files are in the directories where they are supposed to be in relation to the command(s) that you are running.

ADD REPLY • link 4.6 years ago by Kevin Blighe 88k

Did you download and decompress the .pgen file from the website? You need to follow the instructions on that page.
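A quick pre-flight check along these lines may help confirm that all three files are in place before running plink2. This is a rough sketch: `check_pfile` is a hypothetical helper, and the decompression commands in the comments are assumptions to verify against the instructions on the resources page.

```shell
# Decompress the .pgen first (assumed commands; follow the resources page):
#   zstd -d all_phase3.pgen.zst
#   plink2 --zst-decompress all_phase3.pgen.zst all_phase3.pgen

# check_pfile PREFIX
# Hypothetical helper: report which of PREFIX.pgen, PREFIX.psam and
# PREFIX.pvar(.zst) are missing. Returns 0 if everything is present.
check_pfile() {
    prefix=$1
    missing=0
    for ext in pgen psam; do
        if [ ! -f "$prefix.$ext" ]; then
            echo "missing: $prefix.$ext" >&2
            missing=1
        fi
    done
    # With the 'vzs' modifier the variant file is PREFIX.pvar.zst;
    # without it, a plain PREFIX.pvar is used.
    if [ ! -f "$prefix.pvar.zst" ] && [ ! -f "$prefix.pvar" ]; then
        echo "missing: $prefix.pvar.zst (or $prefix.pvar)" >&2
        missing=1
    fi
    return "$missing"
}

# Example: check_pfile all_phase3 && echo "fileset looks complete"
```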

ADD REPLY • link 4.6 years ago by chrchang523 11k

Thank you all, it finally worked. I renamed my .pvar file from all_phase3.pvar.zst?dl=1 to all_phase3.pvar.zst. I also decompressed the .pgen file.

ADD REPLY • link 4.6 years ago by brendaumoh6 ▴ 10

Out of curiosity, what browser are you using on what operating system, and how are you clicking on the links to download the files? When I click on the links with either Chrome, Firefox, or Safari, across multiple computers, the saved files do not have "?dl=1" at the end of the names.

ADD REPLY • link 4.6 years ago by chrchang523 11k

It seems that it was likely wget. I just tried via wget, and it saves the file under the name the user reported:

wget https://www.dropbox.com/s/qv61mgtx6pz54fz/chr1_phase3.pgen.zst?dl=1

Works via the browser though.
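For anyone downloading with wget, two possible workarounds are sketched here. The first names the output file explicitly with `-O`. The second uses `strip_dl_suffix`, a hypothetical helper written for this illustration, to clean up files that were already saved with the query string in their names.

```shell
# Option 1: name the output explicitly so the "?dl=1" query string is not kept.
# (URL copied from the wget command above; uncomment to run.)
# wget -O chr1_phase3.pgen.zst 'https://www.dropbox.com/s/qv61mgtx6pz54fz/chr1_phase3.pgen.zst?dl=1'

# Option 2: rename files that already carry a "?dl=..." suffix.
# strip_dl_suffix [DIR]   (hypothetical helper; DIR defaults to the current directory)
strip_dl_suffix() {
    dir=${1:-.}
    for f in "$dir"/*\?dl=*; do
        [ -e "$f" ] || continue          # the glob matched nothing
        # Drop everything from the literal "?dl=" onwards.
        # Note: mv overwrites an existing file with the target name.
        mv -- "$f" "${f%\?dl=*}"
    done
}

# Example: strip_dl_suffix .   # all_phase3.pvar.zst?dl=1 -> all_phase3.pvar.zst
```

After the cleanup, phase3_corrected.psam still has to be renamed to all_phase3.psam so that all three files share one prefix.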

ADD REPLY • link 4.6 years ago by Kevin Blighe 88k

I am using Linux with Firefox. Although my file had '?dl=1' attached to its name, I renamed it after downloading it.

ADD REPLY • link 4.6 years ago by brendaumoh6 ▴ 10