Question

Where Can I Download VCF Files For Publicly Available Data?

6

13.3 years ago

Kevin ▴ 640

I have the 1000 Genomes VCF, but I am wondering if there are VCF files available for other genomes, such as:

Korean genomes
African genomes
Venter
Watson

Cheers

vcf snp • 21k views

ADD COMMENT • link updated 2.0 years ago by Ram 45k • written 13.3 years ago by Kevin ▴ 640

score 3 · Answer 1 · 2011-12-12

3

13.3 years ago

Zev.Kronenberg 12k

Complete Genomics has some publicly available datasets, and I am sure there is a converter to VCF. If you have an FTP server, webspace, or some other way to share data, I would be happy to send you the 200 Danish exomes in VCF.

ADD COMMENT • link 13.3 years ago by Zev.Kronenberg 12k

1

Hi Zev: I am interested in the dataset too. If you have the permissions (including IRB and institutional approval), it would be nice if you could upload the data to a public data repository such as the European Nucleotide Archive (http://www.ebi.ac.uk/ena/data/search/) or a similar resource and share the URL here.

ADD REPLY • link 13.3 years ago by Khader Shameer 18k

0

Hi Zev -- I'd love to have access to the 200 Danish exomes as well, would be glad to provide more details.

ADD REPLY • link 13.3 years ago by Deniz ▴ 140

0

I believe I don't need IRB since I generated the VCF calls from raw reads that were publicly available?

ADD REPLY • link 13.3 years ago by Zev.Kronenberg 12k

0

Awesome. If ENA is not appropriate for VCF submissions, you could also try the Dryad data repository: http://datadryad.org/.

ADD REPLY • link 13.3 years ago by Khader Shameer 18k

0

Hi Zev, Kevin... what about publishing the Danish VCFs in http://gigadb.org/? I've been told that the Danish VCFs were processed in collaboration with BGI, so maybe it makes sense to have them there.

ADD REPLY • link 13.3 years ago by Roman Valls Guimerà ▴ 620

0

Greetings,

I don't want to upload these data to a site where I might be confused as being involved in the project. I have the VCFs all ready to go... any other suggestions?

Our pipeline:

BWA
SAMTOOLS
PICARD - dedup
GATK - INDEL realign
SAMTOOLS - call variants
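
For readers who want to reproduce something similar, the pipeline listed above could be sketched roughly as below. This is only an illustration under several assumptions, not the exact commands used for the Danish exomes: the function name `call_variants`, the file naming, and the jar locations `PICARD_JAR` and `GATK_JAR` are hypothetical; it assumes GATK 3.x (IndelRealigner is not part of GATK 4); it assumes the reference is already indexed for BWA, samtools, and GATK; and it swaps in `bcftools mpileup | bcftools call` for the variant-calling step, since current samtools releases delegate calling to bcftools.

```shell
# Hypothetical jar locations; adjust to your installation.
PICARD_JAR=${PICARD_JAR:-picard.jar}
GATK_JAR=${GATK_JAR:-GenomeAnalysisTK.jar}   # assumes GATK 3.x

# Usage: call_variants ref.fa reads_1.fq reads_2.fq sample_name
# The body runs in a subshell so `set -e` and the variables stay local.
call_variants() (
    set -e
    ref=$1; fq1=$2; fq2=$3; sample=$4

    # 1. BWA: align paired reads; GATK expects a read group on every read.
    bwa mem -R "@RG\tID:${sample}\tSM:${sample}\tPL:ILLUMINA" \
        "$ref" "$fq1" "$fq2" > "${sample}.sam"

    # 2. SAMTOOLS: coordinate-sort and index the alignments.
    samtools sort -o "${sample}.sorted.bam" "${sample}.sam"
    samtools index "${sample}.sorted.bam"

    # 3. PICARD: mark duplicates (add REMOVE_DUPLICATES=true to drop them).
    java -jar "$PICARD_JAR" MarkDuplicates \
        I="${sample}.sorted.bam" O="${sample}.dedup.bam" \
        M="${sample}.dup_metrics.txt"
    samtools index "${sample}.dedup.bam"

    # 4. GATK 3.x: find realignment targets, then realign around indels.
    java -jar "$GATK_JAR" -T RealignerTargetCreator -R "$ref" \
        -I "${sample}.dedup.bam" -o "${sample}.intervals"
    java -jar "$GATK_JAR" -T IndelRealigner -R "$ref" \
        -I "${sample}.dedup.bam" -targetIntervals "${sample}.intervals" \
        -o "${sample}.realigned.bam"

    # 5. Call variants and write an uncompressed VCF.
    bcftools mpileup -f "$ref" "${sample}.realigned.bam" |
        bcftools call -mv -Ov -o "${sample}.vcf"
)
```

Something like `call_variants hg19.fa sample_1.fq sample_2.fq sample` would then leave `sample.vcf` in the working directory, provided all four tools are installed.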

ADD REPLY • link 13.3 years ago by Zev.Kronenberg 12k

0

Hi Zev, are the VCFs still available? I am interested in obtaining a copy. My email is ashkot[at]hotmail.com

ADD REPLY • link 10.3 years ago by win ▴ 990

0

Hi Zev -- I'd love to have access to the 200 Danish exomes as well. Could you send them to me? eyupsvs@hotmail.com

ADD REPLY • link 7.3 years ago by eyupuctepe • 0

0

Can you share the file via OneDrive?

ADD REPLY • link 6.8 years ago by tesmai4 • 0

score 3 · Answer 2 · 2011-12-12

The SNPs for those genomes are available for download from UCSC in tables named "pg*". You could generate VCF files from them using awk, with something like:

$ curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/pgVenter.txt.gz" |\
gunzip -c |\
awk 'BEGIN { printf("#CHROM\tPOS\tID\tREF\tALT\n");} { printf("%s\t%d\t.\t.\t%s\n",$2,1+int($3),$5);}'

#CHROM  POS ID  REF ALT
chr1    65745   .   .   G
chr1    65797   .   .   C
chr1    65872   .   .   G
chr1    66008   .   .   G
chr1    66162   .   .   T
chr1    66258   .   .   G
chr1    66275   .   .   T
chr1    66294   .   .   TA/AT
chr1    66312   .   .   T
chr1    566139  .   .   A/C
(...)
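
The output above is VCF-like but not a complete VCF: it has no `##fileformat` line, only five of the eight fixed columns, and alternative alleles separated by "/" rather than ",". A slightly fuller sketch is below. The function name `pgsnp_to_vcf` is made up for illustration, and the `N` in the REF column is a placeholder assumption, because the pg* tables list the observed alleles but do not store the reference base. A real conversion would look REF up in the hg19 sequence and drop it from ALT where it appears.

```shell
# Convert UCSC pg* (pgSnp) rows read from stdin into minimal VCF-style records.
# Input columns: bin, chrom, chromStart (0-based), chromEnd, alleles, ...
pgsnp_to_vcf() {
    awk -F '\t' '
    BEGIN {
        print "##fileformat=VCFv4.1"
        print "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
    }
    {
        alt = $5
        gsub("/", ",", alt)   # "A/C" -> "A,C"
        # POS is 1-based, so add 1 to chromStart.
        # REF is a placeholder "N": the table has no reference-base column.
        printf "%s\t%d\t.\tN\t%s\t.\t.\t.\n", $2, $3 + 1, alt
    }'
}
```

It can be used in place of the awk command above, by piping the `curl ... | gunzip -c` output into `pgsnp_to_vcf`. The result is still only an approximation, so validate it before handing it to downstream tools.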

score 2 · Answer 3 · 2011-12-12

VCF is a very flexible format & I would be careful converting Complete Genomics directly into VCF on your own -- for example Complete handles complex variants very differently compared to how 1000G handled them in the Pilot phase. Digging into the supplemental information on the Korean genome publication etc. can help fill some of those extra fields.

Also, the genomes you've mentioned contain Structural Variation data of various degrees of completeness -- and VCF files do exist for these kinds of variants as well.

By VCF file, do you mean you're interested in the format itself, or a particular kind of variant?

score 2 · Answer 4 · 2011-12-12

2

13.3 years ago

Gustavo ▴ 530

Depending on what you're trying to do, you might find Kaviar useful: http://db.systemsbiology.net/kaviar/

ADD COMMENT • link 13.3 years ago by Gustavo ▴ 530

Ram · Answer 5 · 2011-12-15

1

13.3 years ago

Zev.Kronenberg 12k

Greetings again. I have loaded the Danish Exomes to a Dropbox. Shoot me an email to get the goods :-).

ADD COMMENT • link 13.3 years ago by Zev.Kronenberg 12k

0

Hi, could you share them with raygoza4@gmail.com? Regards

ADD REPLY • link 13.3 years ago by Raygozak ★ 1.4k

0

This is great, could you also share with sptaylorUCLA@gmail.com? Thanks

ADD REPLY • link 13.3 years ago by Sptaylor ▴ 120

0

Thanks, Zev! Sorry for the late reply. Somehow the email notifications didn't work and I hadn't realised my question had been answered. I was actually more interested in WGS VCFs, but thanks for your offer. I will keep it in mind when I do exomes next (soon!).

ADD REPLY • link 13.2 years ago by Kevin ▴ 640

0

@zev.kronenberg thanks Zev, my email is denizkural --@T-- gmail --D0T-- com I likewise finally returned to the thread, if a bit late.

ADD REPLY • link 13.2 years ago by Deniz ▴ 140

0

Hi, if you still have them, could you also share them with me? avpostma@gmail.com. Many thanks.

ADD REPLY • link 12.8 years ago by avpostma • 0

0

Hi Zev,

I'm also very interested in the 200 Danish exomes in VCF, as the SRA access is no longer valid. Could you please share them with coucou90@gmail.com?

Regards
Kirsley

ADD REPLY • link updated 3.2 years ago by Ram 45k • written 10.4 years ago by kc.mailinglist • 0

0

Sent the link to your email.

ADD REPLY • link 10.4 years ago by Zev.Kronenberg 12k

0

Is it possible to get the data to wowater@yandex.ru? Thanks!

ADD REPLY • link updated 2.8 years ago by Ram 45k • written 10.0 years ago by Vova Naumov ▴ 220

0

Hi Zev - please share the download link with me, too! This is really nice of you to provide, thank you. Email is irene@sequencing.com

ADD REPLY • link 10.0 years ago by Irene@Sequencing.com ▴ 280

0

Hi, could you share them with piechota.marcin [at] gmail.com? Thanks

ADD REPLY • link updated 3.2 years ago by Ram 45k • written 9.9 years ago by piechota.marcin ▴ 70

0

Hi,

I was wondering if the Danish data are still available.

Thank you

ADD REPLY • link 9.0 years ago by francesco.montinaro • 0

0

Hi Zev, picking up quite an old thread: is the dataset still available for download? Email: tesmai4@gmail.com

ADD REPLY • link 6.8 years ago by tesmai4 • 0

Ram · Answer 6 · 2014-06-12

0

10.8 years ago

e.cirillo8923 • 0

Dear Zev, I'm also really interested in the 200 Danish exomes in VCF. Is it still possible to share them?

My email is: elisa.cirillo@maastrichtuniversity.nl

Thank you very much!

ADD COMMENT • link updated 5.2 years ago by Ram 45k • written 10.8 years ago by e.cirillo8923 • 0