Obtain population information for the 1092 samples of the 1000 Genomes Project
4.8 years ago
lxiao63 • 0

I have downloaded genomic data for the 1000 Genomes Phase 1 samples from https://www.ncbi.nlm.nih.gov/projects/faspftp/1000genomes/.

I checked the resulting .FAM file (1092 rows, each corresponding to one sample in the 1000 Genomes Phase 1 release) and noted that there is a column named member, whose first 20 entries are:

HG00096 HG00097 HG00099 HG00100 HG00101 HG00102 HG00103 HG00104 HG00106 HG00108 HG00109 HG00110 HG00111 HG00112 HG00113 HG00114 HG00116 HG00117 HG00118 HG00119

I wish to determine the population (e.g., CHB, JPT, CEU) and super population (e.g., EAS, EUR, AFR) from the member IDs. To do so, I downloaded the pedigree file from https://www.internationalgenome.org/faq/can-i-get-phenotype-gender-and-family-relationship-information-samples/.

The pedigree file has 3501 rows rather than 1092. It has a column named Individual ID whose contents are HG01879, HG01880, HG01881, etc. However, none of the members in my .FAM file can be found among the 3501 rows of the pedigree file! The two files seem to be completely unrelated.

I would like to ask whether it is possible to determine the population of origin of the 1092 1000 Genomes samples from their member IDs. If so, where can I find the metadata that relates sample ID to population?

Thank you.

1000 genome project • 937 views
4.8 years ago
JC 13k

You can use the 1000 Genomes data portal.
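
Once a sample table has been exported from the portal, matching it to the member IDs is a simple lookup. The following is only a minimal sketch, not a definitive recipe. It assumes the exported table is tab-separated with header columns named `sample`, `pop` and `super_pop` (adjust these to whatever the file you download actually uses), and that the member ID is the second whitespace-separated column of the .FAM file, as in the PLINK format. The function names and the inline example rows are hypothetical stand-ins for real files.

```python
import csv
import io


def read_fam_members(handle):
    """Return the member (within-family) IDs, i.e. column 2 of a PLINK-style .fam file."""
    return [line.split()[1] for line in handle if line.strip()]


def load_panel(handle):
    """Map sample ID -> (population, super population).

    Assumes a tab-separated table with 'sample', 'pop' and 'super_pop' headers.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["sample"]: (row["pop"], row["super_pop"]) for row in reader}


def annotate(member_ids, panel):
    """Return (id, pop, super_pop) per member; IDs missing from the panel get None."""
    return [(m, *panel.get(m, (None, None))) for m in member_ids]


# Illustrative in-memory stand-ins; with real data, pass open("yourfile") instead.
fam_text = "HG00096 HG00096 0 0 0 -9\nHG00097 HG00097 0 0 0 -9\n"
panel_text = "sample\tpop\tsuper_pop\nHG00096\tGBR\tEUR\nHG00097\tGBR\tEUR\n"

members = read_fam_members(io.StringIO(fam_text))
panel = load_panel(io.StringIO(panel_text))

for row in annotate(members, panel):
    # Print NA for any ID that has no entry in the panel table
    print("\t".join("NA" if value is None else value for value in row))
```

Any member that comes back as NA is an ID missing from the table you downloaded, which is a quick way to check whether the metadata file covers the Phase 1 samples.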
