Hi,
I have a VCF file (which contains indels as well as SNPs) obtained from an individual's DNA (from a whole genome microarray) with respect to the reference genome. Is there any tool out there that will produce a UCSC chain file with respect to the reference genome, given the VCF file?
Huh? A chain file maps the regions between two assemblies. How would you use a VCF file to build this chain file?
The indels in the VCF file are in essence how the 'assembled' genome of the individual differs from the reference assembly, in coordinates. So given these indels, and hence a chain file, a personal genome can be created.
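To make the idea concrete, here is a rough sketch of the bookkeeping I have in mind. It is not an existing tool, just an illustration under several assumptions: a single-sample VCF, only homozygous-alt indels with one ALT allele, records sorted and non-overlapping, and one chain per chromosome with the reference as the chain's target side. The function names `homozygous_indels` and `build_chain` are made up for this post, and the score in the header is simply the number of aligned bases.

```python
def homozygous_indels(vcf_lines, chrom):
    """Yield (pos, ref, alt) for homozygous-alt indels on one chromosome.

    Assumption: single-sample VCF; multi-allelic and symbolic ALTs are skipped.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        if len(f) < 10 or f[0] != chrom:
            continue
        ref, alt = f[3], f[4]
        # Keep plain sequence alleles only (drops "<DEL>", "*", "A,T", ".").
        if not set(alt.upper()) <= set("ACGTN"):
            continue
        if len(ref) == len(alt):  # SNP or MNP: no coordinate shift
            continue
        keys = f[8].split(":")
        if "GT" not in keys:
            continue
        gt = f[9].split(":")[keys.index("GT")]
        if gt in ("1/1", "1|1"):
            yield int(f[1]), ref, alt


def build_chain(chrom, ref_len, indels, chain_id=1):
    """Return UCSC chain text mapping reference coordinates to the personal genome."""
    cursor = 0   # 0-based reference position where the current aligned block starts
    q_size = 0   # running length of the personal sequence
    blocks = []
    for pos, ref, alt in sorted(indels):
        if pos - 1 < cursor:
            raise ValueError(f"overlapping variant at {chrom}:{pos}")
        k = min(len(ref), len(alt))      # bases that stay aligned (incl. anchor base)
        block_end = pos - 1 + k
        size = block_end - cursor
        dt = len(ref) - k                # reference bases missing from personal genome
        dq = len(alt) - k                # personal bases missing from reference
        blocks.append((size, dt, dq))
        cursor = block_end + dt
        q_size += size + dq
    if cursor > ref_len:
        raise ValueError("variant runs past the end of the reference sequence")
    last = ref_len - cursor
    q_size += last
    score = sum(b[0] for b in blocks) + last   # placeholder score: aligned bases
    header = (f"chain {score} {chrom} {ref_len} + 0 {ref_len} "
              f"{chrom} {q_size} + 0 {q_size} {chain_id}")
    body = [f"{s}\t{dt}\t{dq}" for s, dt, dq in blocks] + [str(last)]
    return "\n".join([header] + body) + "\n\n"


example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "chr1\t10\t.\tA\tACG\t.\tPASS\t.\tGT\t1/1",    # 2 bp insertion
    "chr1\t30\t.\tG\tT\t.\tPASS\t.\tGT\t1/1",      # SNP, ignored
    "chr1\t50\t.\tTCCC\tT\t.\tPASS\t.\tGT\t1|1",   # 3 bp deletion
    "chr1\t70\t.\tA\tAT\t.\tPASS\t.\tGT\t0/1",     # heterozygous, ignored
]
chain_text = build_chain("chr1", 100, homozygous_indels(example_vcf, "chr1"))
```

The toy reference length of 100 and the example records are invented. A real version would read chromosome lengths from the VCF header or a .fai index and would need a decision on what to do with heterozygous and overlapping calls.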
A VCF file is a list of local differences, many of which may be inexact, especially longer variations. It is very unlikely that they would contain information at sufficient accuracy to create an entire assembly. These files are not designed to correct for cumulative errors, yet when you remap intervals a single error can affect all subsequent coordinates.
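As a toy illustration of the cumulative effect (made-up positions and alleles, and a hypothetical helper called `lift`), consider remapping a reference position by adding up the net length change of every indel upstream of it. One spurious call shifts everything downstream of it by the same amount:

```python
def lift(ref_pos, indels):
    """Map a 1-based reference position to the personal coordinate.

    indels: iterable of (pos, ref, alt) in VCF convention. Only variants that
    end before ref_pos contribute; positions inside a variant are not handled.
    """
    shift = 0
    for pos, ref, alt in indels:
        last_ref_base = pos + len(ref) - 1
        if last_ref_base < ref_pos:
            shift += len(alt) - len(ref)
    return ref_pos + shift


true_calls = [(100, "A", "ACGT"), (5000, "TCC", "T")]   # +3 bp, then -2 bp
with_error = true_calls + [(200, "G", "GAAAAA")]        # one false +5 bp insertion

near = (lift(150, true_calls), lift(150, with_error))        # upstream of the error
far = (lift(10_000, true_calls), lift(10_000, with_error))   # downstream of the error
```

Positions upstream of the false call agree in both versions, while every position downstream is off by 5 bp, no matter how far away it is.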
That may be true or not depending on what data the VCF was generated from.
I'm just asking whether anyone knows of a script/tool out there that takes a VCF (produced for a single individual sample) and creates a chain file based on the indels in the VCF (the homozygous indels, if we are to be accurate), to save me writing that myself. That's all.
There are two different questions here really.
I don't actually know the answer to either of these. But my guess would be a no.
But I'd be happy to learn more on what actually happens in practice.