4.0 years ago
elisheva
Hi all,
I would like to examine whether the frequencies of many alleles differ between the African and North European populations.
I downloaded the 1000 Genomes Project WGS VCF file from the FTP site:
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz
But the allele frequencies are given only per super population, so the European (EUR) frequency also includes the samples from Spain and Italy.
Is anyone familiar with this kind of data who could help?
Thanks
Thanks for your response.
It seems that the VCF file and the pedigree file come from different versions.
I couldn't find matching versions of the two files.
Is there any way to solve this?
The pedigree file contains some samples that were excluded from the phase 3 VCF, but that's okay, the extra sample IDs will just be ignored by --keep.
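To make the --keep step concrete, here is a minimal Python sketch of building the list of sample IDs for one group. It assumes a tab-separated sample table with `sample`, `pop` and `super_pop` columns; your pedigree file may name its columns differently, so adjust accordingly. The rows are made up for illustration, and treating GBR, FIN and CEU as "North European" is my assumption, not something defined by the project.

```python
import csv
import io


def select_samples(panel_text, wanted_pops):
    """Return the sample IDs whose population code is in wanted_pops."""
    reader = csv.DictReader(io.StringIO(panel_text), delimiter="\t")
    return [row["sample"] for row in reader if row["pop"] in wanted_pops]


# Tiny made-up table, for illustration only (not real 1000 Genomes rows).
example_panel = (
    "sample\tpop\tsuper_pop\n"
    "S1\tGBR\tEUR\n"
    "S2\tIBS\tEUR\n"
    "S3\tYRI\tAFR\n"
    "S4\tFIN\tEUR\n"
)

# Assumption: which population codes count as "North European".
north_european = {"GBR", "FIN", "CEU"}

keep_ids = select_samples(example_panel, north_european)
print("\n".join(keep_ids))  # one ID per line, ready to save as a keep file
```

The column layout that --keep expects depends on the plink version, so check its documentation before saving the IDs to a file.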
Apparently, there are no sample IDs in the VCF file I posted, and I couldn't find any others.
Oh, that's because you downloaded the sites VCF instead of the full genotypes VCF. Download a .genotypes.vcf.gz file from http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ instead, or the plink2-formatted version from https://www.cog-genomics.org/plink/2.0/resources#1kg_phase3 .
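To illustrate why the per-sample genotype columns matter, here is a rough Python sketch that computes the ALT allele frequency at one biallelic site for an arbitrary subset of samples. The header and record lines are invented for the example, and the function name is hypothetical. A sites-only VCF has no columns after INFO, so there is nothing to count.

```python
def alt_allele_frequency(header_line, record_line, sample_ids):
    """ALT allele frequency at one biallelic site among the given samples.

    Returns None when no called alleles are found for those samples.
    """
    columns = header_line.rstrip("\n").split("\t")
    fields = record_line.rstrip("\n").split("\t")
    alt_count = 0
    total = 0
    # Sample columns start at index 9, after the FORMAT column.
    for name, value in zip(columns[9:], fields[9:]):
        if name not in sample_ids:
            continue
        genotype = value.split(":")[0]  # GT is the first FORMAT key
        for allele in genotype.replace("|", "/").split("/"):
            if allele == ".":
                continue  # skip missing calls
            total += 1
            if allele == "1":
                alt_count += 1
    return alt_count / total if total else None


# Made-up lines, for illustration only.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4"
record = "1\t1000\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1\t0|0\t0|0"

freq_a = alt_allele_frequency(header, record, {"S1", "S2"})
freq_b = alt_allele_frequency(header, record, {"S3", "S4"})
print(freq_a, freq_b)
```

This is only meant to show the idea. For the full genotype files, a dedicated tool such as plink2 will be far faster than plain Python.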