I would like to classify a set of query genomic ranges against a set of truth genomic ranges, using a minimum overlap rule: the intersection of the two ranges divided by their union must be more than one half. I call an overlap above this threshold a successful overlap.
If a query range, e.g. Q1, successfully overlaps one of the truth ranges (e.g. T12), I classify Q1 as a True Positive. If it doesn't, I classify it as a False Positive.
But I am unsure how to classify a case where two query ranges, e.g. Q1 and Q2, both successfully overlap the same truth range, e.g. T3:
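To make the rule concrete, here is a minimal Python sketch. It assumes all ranges lie on the same chromosome and are written as half-open (start, end) tuples; the function names `iou` and `classify` and the coordinates are made up for illustration.

```python
def iou(a, b):
    """Intersection over union of two half-open intervals (start, end)."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def classify(queries, truths, threshold=0.5):
    """Label a query TP if its IoU with any truth range exceeds the threshold, else FP."""
    return {
        name: "TP" if any(iou(q, t) > threshold for t in truths.values()) else "FP"
        for name, q in queries.items()
    }


# Hypothetical coordinates, chosen only to show one pass and one fail.
truths = {"T12": (100, 200)}
queries = {"Q1": (110, 205), "Q2": (180, 400)}
labels = classify(queries, truths)
print(labels)
```

With these coordinates Q1 has an intersection/union of 90/105 and passes, while Q2 has 20/300 and fails.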
Example:
T3 |----------------------------|
Q1 |-------------------------|
Q2 |--------------------------|
How would people classify Q1 and Q2? Both as True Positives? One as True Positive and the other as False Positive?
That entirely depends on your question. You can define any number of rules with or without biological backing, but even with a biological basis, there will be gray areas.
I think even if you gave more info, including the exact problem you're trying to address, there would be no single, clear answer.
It depends on what question you're trying to answer. Perhaps Q2 would be a True Positive and Q1 a False Positive, on the basis that Q2's overlap with T3 is longer than Q1's. Or perhaps you are just categorizing overlaps with T3 above some threshold, which would label both Q1 and Q2 as True Positives.
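The two options can be compared side by side with a small Python sketch. It is only an illustration under stated assumptions: ranges are half-open (start, end) tuples on one chromosome, the coordinates are invented to resemble the diagram in the question, and the names `classify_any` and `classify_best` are hypothetical. Breaking ties by overlap length is one possible choice among many.

```python
def overlap_len(a, b):
    """Length of the intersection of two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def iou(a, b):
    """Intersection over union of two half-open intervals."""
    inter = overlap_len(a, b)
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def classify_any(queries, truths, threshold=0.5):
    """Policy A: every query that passes the threshold against some truth is TP."""
    return {
        name: "TP" if any(iou(q, t) > threshold for t in truths.values()) else "FP"
        for name, q in queries.items()
    }


def classify_best(queries, truths, threshold=0.5):
    """Policy B: per truth range, only the passing query with the longest overlap is TP."""
    labels = {name: "FP" for name in queries}
    for t in truths.values():
        passing = [
            (overlap_len(q, t), name)
            for name, q in queries.items()
            if iou(q, t) > threshold
        ]
        if passing:
            # Equal overlap lengths fall back to comparing names, an arbitrary tie-break.
            labels[max(passing)[1]] = "TP"
    return labels


# Invented coordinates: both queries pass the threshold against T3.
truths = {"T3": (0, 30)}
queries = {"Q1": (2, 27), "Q2": (3, 29)}
lenient = classify_any(queries, truths)
strict = classify_best(queries, truths)
print(lenient)
print(strict)
```

Under policy A both queries count as True Positives. Under policy B only Q2 does, because its overlap with T3 (26) is longer than Q1's (25), and Q1 becomes a False Positive. Which policy is appropriate still depends on the question being asked.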