Working with 1000G data files
10.4 years ago
goodcow ▴ 20

I'm trying to work with files downloaded from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521

What tools do I need to work with these files? I believe they are vcftools and tabix, right? The documentation is not clear at all.

1000genomes vcftools tabix • 3.3k views

Maybe you could explain what you're trying to achieve? "Working with files" sounds very vague.

It depends on what you want to do. VCFtools will work for most things.

Do I have to unzip the .vcf file before using vcftools?

10.4 years ago
1234Jc4321 ▴ 450

http://www.1000genomes.org/using-1000-genomes-data

Look at the file titled "The 1000 Genomes Tools".

It's pretty clear.

I'm going to be working with SNPs. To clarify, this index has the SNP files, correct? ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20110521/

Yes, as Phillip told you in your other post.

Does the .ALL.wgs...vcf.gz contain ALL the SNPs? If so, what's the point of the .ALL.chr1, chr2, ... chrX.vcf.gz files? I think that's what I really don't understand. Thanks for your help.

As you can see from the names, the file

ALL.wgs.phase1_release_v3.20101123.snps_indels_sv.sites.vcf.gz

contains only the variant sites (no per-sample genotype columns), whereas the files

ALL.chr1.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz

etc.

contain all the genotypes for the 1000 Genomes individuals.
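
If you want to check this for yourself, the difference shows up in the `#CHROM` header line: a sites-only VCF stops at the `INFO` column, while a genotypes VCF continues with `FORMAT` and one column per individual. Below is a minimal Python sketch of that check, using only the standard library. The helper names (`open_vcf`, `sample_names`, `describe`) are made up for illustration and are not part of vcftools or tabix; it also assumes the file has already been downloaded locally. The `.vcf.gz` files can be read as they are with Python's `gzip` module, so no unzipping is needed for this.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip-compressed VCF file as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def sample_names(lines):
    """Return the sample IDs listed on the #CHROM header line.

    `lines` is any iterable of VCF text lines (an open file works).
    An empty list means the file has no genotype columns.
    """
    for line in lines:
        if line.startswith("##"):
            continue  # meta-information lines come before the column header
        if line.startswith("#CHROM"):
            columns = line.rstrip("\n").split("\t")
            # Columns 0-7 are CHROM..INFO, column 8 is FORMAT,
            # and everything after that is one column per sample.
            return columns[9:]
        break  # reached a data line without seeing the column header
    raise ValueError("no #CHROM header line found")


def describe(lines):
    """Classify a VCF as 'sites-only' or 'genotypes' from its header."""
    return "genotypes" if sample_names(lines) else "sites-only"


# Example usage (assumes the file sits in the current directory):
# with open_vcf("ALL.wgs.phase1_release_v3.20101123.snps_indels_sv.sites.vcf.gz") as fh:
#     print(describe(fh))
```

Only the header is read, so this returns quickly even on the large per-chromosome files.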
