Database Of Human Structural Variants Enabling VCF File Creation
12.4 years ago
Travis ★ 2.8k

Hi all,

I am trying to obtain or create a VCF file containing known, well-validated human structural variants so that I can introduce them into a reference genome. The end goal is to assess a range of SV detection tools using an unbiased, mutated dataset. Is anyone aware of a human SV database that offers this information for download in VCF format, or in a format easily converted to VCF? I have done some searching, but the only databases I have found don't appear to provide the 'reference' and 'mutant' sequences that would let me recreate each mutation in silico.

Thanks in advance.

next-gen vcf sv • 4.0k views
12.4 years ago
deanna.church ★ 1.1k

dbVar (http://www.ncbi.nlm.nih.gov/dbvar) is a database of structural variants and provides files in GVF format over FTP. You can get data by organism/assembly or by organism/study: ftp://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/Homo_sapiens/
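
Since GVF is a tab-delimited, GFF3-style format with nine columns, converting it to VCF with symbolic alleles (`<DEL>`, `<DUP>`, ...) could be sketched roughly as below. This is only an illustration under several assumptions: the Sequence Ontology to SVTYPE mapping is a small hypothetical subset, `N` is used as a placeholder REF base because no reference sequence is consulted, and the function names are made up for this example. It does not recover the actual 'mutant' sequence, so it is only suitable for variant types that can be described by coordinates alone.

```python
# Hypothetical, incomplete mapping from GVF (Sequence Ontology) types
# to VCF symbolic SV types; extend it for the types present in your file.
SO_TO_SVTYPE = {
    "deletion": "DEL",
    "copy_number_loss": "DEL",
    "duplication": "DUP",
    "tandem_duplication": "DUP",
    "copy_number_gain": "DUP",
    "inversion": "INV",
    "insertion": "INS",
}

VCF_HEADER = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">',
    '##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]


def gvf_line_to_vcf(line):
    """Convert one GVF feature line to a VCF record line, or return None if skipped."""
    if not line.strip() or line.startswith("#"):
        return None  # blank line or GVF pragma/comment
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None  # not a well-formed GFF3-style feature line
    seqid, _source, so_type, start, end, _score, _strand, _phase, attrs = fields
    svtype = SO_TO_SVTYPE.get(so_type)
    if svtype is None:
        return None  # variant type not covered by the mapping above
    attributes = dict(kv.split("=", 1) for kv in attrs.split(";") if "=" in kv)
    var_id = attributes.get("ID", ".")
    # For symbolic alleles, VCF POS is the base before the event.
    pos = max(int(start) - 1, 1)
    info = f"SVTYPE={svtype};END={int(end)}"
    # "N" is a placeholder REF base (assumption: no reference FASTA is read here).
    return "\t".join([seqid, str(pos), var_id, "N", f"<{svtype}>", ".", ".", info])


def gvf_to_vcf(lines):
    """Yield VCF header lines followed by the converted records."""
    yield from VCF_HEADER
    for line in lines:
        record = gvf_line_to_vcf(line)
        if record is not None:
            yield record
```

With a real file you would pass an open file handle to `gvf_to_vcf` and write out the yielded lines; replacing the `N` placeholder with the true reference base requires looking it up in the matching assembly.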

12.4 years ago
matted 7.8k

I'm not positive what exactly you need, but have you looked at the 1000 Genomes structural variation dataset?

"The pilot paper data directory contains vcf files for different types of structural variants both for the low coverage and trio pilot studies"

Data here: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/pilot_data/paper_data_sets/a_map_of_human_variation/

It looks like things are categorized by sequencing strategy and variant type, which might be what you need. They're already in VCF, as well.


Apologies - I have edited the question to include "The end goal is to assess a range of SV detection tools using an unbiased, mutated dataset". Since the 1000G VCFs are largely unvalidated and are based on some of the software I would like to test, they don't satisfy the well-validated/unbiased criteria. Furthermore, the 1000 Genomes data you linked to consists of SNVs and small indels only - I am specifically interested in larger structural variants.
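
For anyone wanting to check whether a given VCF actually contains larger events, a rough sketch of a size filter follows. It assumes plain-text (uncompressed) VCF lines, estimates length from `END` for symbolic alleles and from the REF/ALT length difference otherwise, and uses 50 bp as the cutoff, which is a common convention rather than anything specified in this thread. The function names are hypothetical.

```python
def variant_length(vcf_line):
    """Estimate the size in bp of the variant on one VCF record line."""
    fields = vcf_line.rstrip("\n").split("\t")
    pos, ref, alt, info = int(fields[1]), fields[3], fields[4], fields[7]
    info_map = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
    if alt.startswith("<"):
        # Symbolic allele such as <DEL>: use END - POS when END is given.
        return int(info_map["END"]) - pos if "END" in info_map else 0
    # Explicit alleles: largest REF/ALT length difference (0 for SNVs).
    return max(abs(len(a) - len(ref)) for a in alt.split(","))


def count_large_variants(lines, min_len=50):
    """Count records whose estimated size is at least min_len bp."""
    return sum(
        1
        for line in lines
        if line.strip() and not line.startswith("#")
        and variant_length(line) >= min_len
    )
```

A count of zero on a file suggests it holds only SNVs and small indels under these assumptions; breakend (BND) records and insertions without `END` are not sized by this sketch.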
