Where do I get a large reference VCF?
19 months ago
a5864557 • 0

I would like to download a large .vcf file containing many (hundreds or thousands of) samples. Ideally, I would download separate population-specific .vcf files, but a single file that I can sort or filter by ancestry group would also be fine. Where can I get such a file? I prefer GRCh37 for consistency with the other files I'm using.

1000genomes vcf genome reference-genome ukbiobank • 890 views

I've previously used https://bochet.gcc.biostat.washington.edu/beagle/1000_Genomes_phase3_v5a/b37.vcf/, but those files don't appear to contain ancestry information. I know that population information for individual samples (e.g. "HG00566" or "NA19758") is available elsewhere, in particular from the sample portal at internationalgenome.org/data-portal/sample. In principle, that sample list could be used to filter the VCF samples by population.
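As a rough sketch of that filtering step, the Python below picks sample IDs for one ancestry group out of a tab-separated sample table. The column names ("Sample name", "Superpopulation code") are my assumption about what a portal export looks like, so check them against the header of the file you actually download. The sample IDs and the helper name `select_samples` are made up for illustration.

```python
import csv
import io


def select_samples(tsv_text, code,
                   sample_col="Sample name",
                   pop_col="Superpopulation code"):
    """Return sample IDs whose population column equals `code`.

    The default column names are assumptions, not confirmed against
    the portal; pass your own if the header differs.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row[sample_col] for row in reader if row[pop_col] == code]


# Placeholder table with invented sample IDs, for illustration only.
example = (
    "Sample name\tSuperpopulation code\n"
    "SAMPLE_A\tEAS\n"
    "SAMPLE_B\tAMR\n"
    "SAMPLE_C\tEAS\n"
)

eas_samples = select_samples(example, "EAS")
# One ID per line is the layout expected by a samples file.
samples_file_body = "\n".join(eas_samples) + "\n"
```

Writing `samples_file_body` to a file such as `eas.txt` should give a list that can be passed to `bcftools view -S eas.txt` to subset the VCF, assuming the IDs match the sample names in the VCF header.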

19 months ago

See the 1000 Genomes phase 3 section of the PLINK 2 resources page: https://www.cog-genomics.org/plink/2.0/resources#phase3_1kg. You can select the GRCh37 version and then use plink2 to export VCF(s). The .psam file contains the population labels.
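One way to get per-population VCFs from this is sketched below in Python: group the sample IDs in the .psam file by population label, write one ID list per group, and run plink2 once per list with `--keep`. This assumes the .psam is tab-separated with a `#IID` column and a `SuperPop` column (check your file's header). The helper names, file prefixes, and example rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def group_psam_samples(psam_text, label_col="SuperPop"):
    """Map each population label to the list of sample IDs carrying it.

    Assumes a tab-separated .psam whose header includes '#IID' and
    `label_col`; both column names are assumptions to verify.
    """
    reader = csv.DictReader(io.StringIO(psam_text), delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        groups[row[label_col]].append(row["#IID"])
    return dict(groups)


def plink2_export_command(pfile_prefix, keep_file, out_prefix):
    """Build (but do not run) a plink2 command exporting one subset as VCF."""
    return ["plink2", "--pfile", pfile_prefix,
            "--keep", keep_file,
            "--export", "vcf",
            "--out", out_prefix]


# Placeholder .psam content with invented sample IDs.
example_psam = (
    "#IID\tPAT\tMAT\tSEX\tSuperPop\tPopulation\n"
    "SAMPLE_A\t0\t0\t1\tEUR\tGBR\n"
    "SAMPLE_B\t0\t0\t2\tAFR\tYRI\n"
    "SAMPLE_C\t0\t0\t2\tEUR\tFIN\n"
)

groups = group_psam_samples(example_psam)
# Hypothetical prefixes: 'all_phase3' input, one keep file per label.
command = plink2_export_command("all_phase3", "EUR.txt", "EUR")
```

Each list in `groups` would be written to its own file, one ID per line, and the command list can be passed to `subprocess.run`. Use `label_col="Population"` to split by the finer population labels instead.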
