Hi,
I have a VCF file (ref.vcf) that states exactly where the insertions and deletions are in my genome. My indel detection method produces its own VCF file (let's call it test.vcf).
To calculate the True Positives, I need to detect the intersection of test.vcf and ref.vcf (I use exact intersection for the sake of simplicity for now). The True Positives are the features in test.vcf that are also in ref.vcf. The definition is easy to understand, but how do I code it by hand? Or is there some software or package (in R or MATLAB) that can be used to give the correct results? Thanks.
You want to calculate the intersection of two lists. R has a function named intersect() for this. If you want to measure true/false detection rates, you need to use synthetic data, where the true indels are known.
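A minimal sketch of the exact-intersection idea, written in Python rather than R, in case it helps. It assumes plain-text (uncompressed) VCF with the standard tab-separated columns, and it treats two records as the same variant only if CHROM, POS, REF and ALT all match exactly. The function names (variant_keys, compare_calls) and the records are made up for illustration.

```python
def variant_keys(vcf_lines):
    """Collect (CHROM, POS, REF, ALT) keys from an iterable of VCF lines."""
    keys = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/meta lines
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # A record may list several ALT alleles separated by commas.
        for alt in alts.split(","):
            keys.add((chrom, pos, ref, alt))
    return keys


def compare_calls(ref_keys, test_keys):
    """Return (true positives, false positives, false negatives) as sets."""
    tp = test_keys & ref_keys   # called and present in the reference
    fp = test_keys - ref_keys   # called but absent from the reference
    fn = ref_keys - test_keys   # in the reference but not called
    return tp, fp, fn


# Made-up records standing in for ref.vcf and test.vcf.
ref_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tAT\t.\tPASS\t.",
    "chr1\t250\t.\tGCC\tG\t.\tPASS\t.",
    "chr2\t40\t.\tT\tTGG\t.\tPASS\t.",
]
test_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tAT\t.\tPASS\t.",
    "chr2\t40\t.\tT\tTGA\t.\tPASS\t.",
    "chr2\t90\t.\tCA\tC\t.\tPASS\t.",
]

tp, fp, fn = compare_calls(variant_keys(ref_lines), variant_keys(test_lines))
precision = len(tp) / (len(tp) + len(fp))
recall = len(tp) / (len(tp) + len(fn))
print(len(tp), len(fp), len(fn))
```

For real files, passing an open file handle works the same way, e.g. variant_keys(open("ref.vcf")). Note that exact matching will count the same indel as a mismatch if the two files represent it differently (for example, at a shifted position inside a repeat), so the counts are a lower bound on the true agreement.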