Where Can I Download VCF Files For Publicly Available Data?
13.0 years ago
Kevin ▴ 640

I have the 1000 Genomes VCF, but I am wondering whether VCF files are available for other genomes, such as:

  1. Korean genomes
  2. African genomes
  3. Venter
  4. Watson

Cheers

vcf snp • 20k views
13.0 years ago

Complete Genomics has some publicly available datasets, and I am sure there is a converter to VCF. If you have an FTP server, web space, or some other way to share data, I would be happy to send you the 200 Danish exomes in VCF.


Hi Zev: I am interested in the dataset too. If you have permission (including IRB and institutional approval), it would be nice if you could upload the data to a public data repository such as the European Nucleotide Archive (http://www.ebi.ac.uk/ena/data/search/) or a similar resource and share the URL here.


Hi Zev -- I'd love to have access to the 200 Danish exomes as well, would be glad to provide more details.


I believe I don't need IRB since I generated the VCF calls from raw reads that were publicly available?


Awesome. If ENA is not appropriate for VCF submissions, you may also try the Dryad data repository: http://datadryad.org/.


Hi Zev, Kevin... what about publishing the Danish VCFs in http://gigadb.org/? I've been told that the Danish VCFs were processed in collaboration with BGI, so maybe it makes sense to have them there.


Greetings,

I don't want to upload these data to a site where I might be confused as being involved in the project. I have the VCFs all ready to go... any other suggestions?

Our pipeline:

  1. BWA
  2. SAMTOOLS
  3. PICARD - dedup
  4. GATK - INDEL realign
  5. SAMTOOLS - call variants


Hi Zev, are the VCFs still available? I am interested in obtaining a copy. my email is ashkot[at]hotmail.com


Hi Zev -- I'd love to have access to the 200 Danish exomes as well, could you send me? eyupsvs@hotmail.com


Can you share the file via OneDrive?

13.0 years ago

The SNPs for those genomes are available for download from UCSC in tables named "pg*". You could generate VCF files from them using awk. Something like:

$ curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/pgVenter.txt.gz" |\
gunzip -c |\
awk 'BEGIN { printf("#CHROM\tPOS\tID\tREF\tALT\n");} { printf("%s\t%d\t.\t.\t%s\n",$2,1+int($3),$5);}'

#CHROM  POS ID  REF ALT
chr1    65745   .   .   G
chr1    65797   .   .   C
chr1    65872   .   .   G
chr1    66008   .   .   G
chr1    66162   .   .   T
chr1    66258   .   .   G
chr1    66275   .   .   T
chr1    66294   .   .   TA/AT
chr1    66312   .   .   T
chr1    566139  .   .   A/C
(...)
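
A hedged variant of the one-liner above, in case it helps: the sketch below wraps the awk step in a hypothetical shell function, `pg_to_vcf`, that reads pg* rows from stdin and emits all eight fixed VCF columns plus a `##fileformat` line. It assumes the pg* column order is bin, chrom, chromStart, chromEnd, alleles. It writes `N` as a placeholder REF because the pg* table does not say which allele is the reference, and it lists every slash-separated allele as ALT, which overstates ALT whenever one of those alleles is actually the reference base. Treat the output as a starting point, not a validated VCF.

```shell
#!/usr/bin/env bash
# pg_to_vcf (hypothetical helper): read UCSC pg* table rows on stdin and
# write minimal VCF-style records on stdout.
# Assumed input columns: 1=bin, 2=chrom, 3=chromStart (0-based), 4=chromEnd,
# 5=alleles separated by "/".
pg_to_vcf() {
  awk -F '\t' '
    BEGIN {
      print "##fileformat=VCFv4.2"
      printf("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")
    }
    {
      alt = $5
      gsub("/", ",", alt)   # VCF separates multiple ALT alleles with commas
      # chromStart is 0-based, VCF POS is 1-based.
      # REF is a placeholder (N): the real base must come from the reference.
      printf("%s\t%d\t.\tN\t%s\t.\t.\t.\n", $2, $3 + 1, alt)
    }'
}

# Example use (same URL as above):
# curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/pgVenter.txt.gz" \
#   | gunzip -c | pg_to_vcf > pgVenter.vcf
```

Filling in the real REF base would need a lookup against the hg19 reference sequence, which this sketch does not attempt.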
13.0 years ago
Deniz ▴ 140

VCF is a very flexible format & I would be careful converting Complete Genomics directly into VCF on your own -- for example Complete handles complex variants very differently compared to how 1000G handled them in the Pilot phase. Digging into the supplemental information on the Korean genome publication etc. can help fill some of those extra fields.

Also, the genomes you've mentioned contain Structural Variation data of various degrees of completeness -- and VCF files do exist for these kinds of variants as well.

By VCF file, do you mean you're interested in the format itself, or a particular kind of variant?

13.0 years ago
Gustavo ▴ 530

Depending on what you're trying to do, you might find Kaviar useful: http://db.systemsbiology.net/kaviar/

12.9 years ago

Greetings again. I have loaded the Danish Exomes to a Dropbox. Shoot me an email to get the goods :-).


Hi could you share them with raygoza4@gmail.com. Regards


This is great, could you also share with sptaylorUCLA@gmail.com? Thanks


Thanks, Zev! Sorry for the late reply. Somehow the email notifications didn't work and I hadn't realised my question had been answered. I was actually more interested in WGS VCFs, but thanks for your offer. I will keep it in mind when I do exome work next (soon!).


@zev.kronenberg thanks Zev, my email is denizkural --@T-- gmail --D0T-- com I likewise finally returned to the thread, if a bit late.


Hi, if you still have it could you also share them with me? avpostma@gmail.com. many thanks.


Hi Zev,

I'm also very interested in the 200 Danish exomes in VCF, as the SRA access is no longer valid. Could you please share them with coucou90@gmail.com?

Regards
Kirsley


Sent the link to your email.


Is it possible to get the data to wowater@yandex.ru? Thanks!


Hi Zev - please share the download link with me, too! This is really nice of you to provide, thank you. Email is irene@sequencing.com


Hi could you share them with piechota.marcin [at] gmail.com. Thanks


Hi,

I was wondering if the Danish data are still available.

Thank you


Hi Zev, Picking up quite an old thread. Is the dataset still available for download? email: tesmai4@gmail.com

10.5 years ago

Dear Zev, I'm also really interested in the 200 Danish exomes in VCF. Is it still possible to share them?

My email is: elisa.cirillo@maastrichtuniversity.nl

Thank you very much!
