I am testing structural variant (deletion, duplication, inversion, insertion) calling tools and need a standard dataset to validate my calls.
Where can I find a benchmark BAM file and the corresponding VCF file for a specific individual?
1000 Genomes has calls for all the SV types only in the pilot release, but I cannot work out which BAM file (WGS or WES) I should use to call SVs and which VCF file I should use to check my calls.
Any help with an example will be appreciated.
Should the BAM files be downloaded from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/data/ ? If so, why do the BAM file sizes differ (for example, the low-coverage Illumina BAM files range from 9 to 24 GB)?