Ryan Mills provided us with the following link, where the VCF files can be downloaded: ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/phase1/analysis_results/integrated_call_sets/ . We use these VCF files as the reference set against which we compare our calls. We studied one VCF file, for chromosome 20: ALL.chr20.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz. Deletions with the genotype 0|1, 1|0, or 1|1 were extracted for the samples NA19311, NA19312, NA19313, NA19316, and NA19317. We got 32 deletions with exact breakpoints and 9 deletions with imprecise breakpoints. Is it OK to compare the calls our tools made against deletions with imprecise breakpoints? How is such a comparison performed? Can one assess it with reciprocal overlap (RO)?
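
To make clear what I mean by reciprocal overlap, here is a minimal Python sketch. It assumes RO is defined as the overlap length divided by the length of each interval, taking the smaller of the two fractions, and that coordinates are half-open [start, end). The function name and the coordinates in the usage comment are hypothetical, and nothing here handles imprecise breakpoints, which is exactly the part I am unsure about.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Reciprocal overlap of two half-open intervals [start, end).

    Returns the smaller of (overlap / length of A) and (overlap / length of B),
    or 0.0 when the intervals do not overlap.
    """
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        # Disjoint, merely touching, or zero-length intervals share no bases.
        return 0.0
    frac_a = overlap / (a_end - a_start)  # fraction of A covered by B
    frac_b = overlap / (b_end - b_start)  # fraction of B covered by A
    return min(frac_a, frac_b)


# Hypothetical usage: a called deletion vs. a reference deletion.
# A call would count as a match if the value reaches a chosen threshold
# (the threshold itself is an assumption, e.g. 0.5).
ro = reciprocal_overlap(100, 200, 150, 250)
```

With this definition a small call lying entirely inside a large reference deletion still gets a low score, because the fraction of the larger interval that is covered is small.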