I have a list of 20 QTL from an experiment on a phenotype related to bone development. Of all the candidate genes within the 20 QTL intervals, 5 are associated with a certain disease in humans. I tested whether the set of 20 QTL is enriched for these disease-associated genes. Each of the 20 QTL has a certain length in base pairs. I took that set of 20 lengths and randomly placed them in the mouse genome, forcing them not to overlap and to land no closer than 500,000 bases to each other on either side. Then I took note of the genes inside each interval and counted how many matched the names in the disease-associated gene list. I repeated the process 10,000 times. This gives the expectation under randomness: how many disease-associated genes can I expect to see inside a set of 20 randomly placed QTL? I compare that to my actual observed number, then I calculate the probability.
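To make the procedure concrete, here is roughly what I did in Python. Every input below (chromosome sizes, gene coordinates, gene names, QTL lengths) is a placeholder, not my real data:

```python
import random

# --- placeholder inputs; substitute your own ---
chrom_sizes = {"chr1": 195_471_971, "chr2": 182_113_224}  # mouse chromosome sizes (bp)
genes = {  # gene name -> (chrom, start, end)
    "GeneA": ("chr1", 10_000_000, 10_050_000),
    "GeneB": ("chr2", 50_000_000, 50_120_000),
}
disease_genes = {"GeneA"}        # human disease-associated gene list
qtl_lengths = [2_500_000] * 20   # the 20 observed QTL lengths (bp)
MIN_GAP = 500_000                # required clearance on either side
N_PERM = 10_000

def place_intervals(lengths):
    """Randomly place intervals, rejecting any placement that overlaps
    or comes within MIN_GAP of an already placed interval."""
    placed = []
    for length in lengths:
        while True:
            chrom = random.choice(list(chrom_sizes))
            start = random.randrange(chrom_sizes[chrom] - length)
            end = start + length
            clash = any(c == chrom and start < e + MIN_GAP and end > s - MIN_GAP
                        for c, s, e in placed)
            if not clash:
                placed.append((chrom, start, end))
                break
    return placed

def count_disease_hits(intervals):
    """Count disease-list genes that fall inside any of the intervals."""
    return sum(
        1
        for name, (chrom, g_start, g_end) in genes.items()
        if name in disease_genes
        and any(c == chrom and g_start < e and g_end > s for c, s, e in intervals)
    )

# per-permutation counts of disease-list genes inside the random QTL sets
null_counts = [count_disease_hits(place_intervals(qtl_lengths))
               for _ in range(N_PERM)]
```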
To do that, I simply counted how many times exactly 1 disease-associated gene appeared within a set of 20 randomly placed QTL and divided by 10,000. That gives me the frequency. I did the same for 2 disease-associated genes, then 3, and so on up to 5, which is my observed value. I find that the frequency of 5s in the random data is less than 0.05. Is this appropriate?
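In code, that counting step looks roughly like this (`null_counts` stands for the per-permutation hit counts from the simulation above; the short list here is made up purely for illustration). I also included the "at least as many as observed" style of empirical p-value I have seen described, for comparison:

```python
# made-up per-permutation hit counts, standing in for the real null_counts
null_counts = [0, 1, 0, 2, 1, 0, 5, 3, 0, 1]
observed = 5  # disease-associated genes found in the real 20 QTL

# frequency of exactly the observed count, as described above
freq_exact = null_counts.count(observed) / len(null_counts)

# one-sided empirical p-value: permutations with >= observed hits
# (the +1 keeps the estimate from being exactly zero)
p_upper = (sum(c >= observed for c in null_counts) + 1) / (len(null_counts) + 1)

print(freq_exact, p_upper)
```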
Also, I see online that some suggest a parametric method, like Fisher's exact test, but that is supposed to be used for very small data sets, am I correct? Am I missing something there? How can I implement a parametric calculation of p?
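For reference, the Fisher's exact test suggestions I have seen frame the problem as a 2x2 table of genes in/out of the disease list versus genes inside/outside the QTL intervals. A minimal sketch with scipy, where all of the counts are hypothetical:

```python
from scipy.stats import fisher_exact

# hypothetical counts -- none of these come from the actual experiment
genes_in_qtl = 400      # candidate genes inside the 20 QTL intervals
disease_in_qtl = 5      # of those, genes on the disease list (observed)
total_genes = 20_000    # genes in the annotation used
disease_total = 100     # disease-list genes present in that annotation

# 2x2 table: rows = on disease list / not, columns = inside QTL / outside
table = [
    [disease_in_qtl, disease_total - disease_in_qtl],
    [genes_in_qtl - disease_in_qtl,
     total_genes - genes_in_qtl - (disease_total - disease_in_qtl)],
]
odds_ratio, p_value = fisher_exact(table, alternative="greater")
print(odds_ratio, p_value)
```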
Even if you considered all of these situations, some confounder might still be missed, right? Does that mean that, over the past decades, a large number of false-positive enrichments have been reported in large bioinformatics and genomics studies?
Unknown confounders can always exist. There's no way around that in any experimental science.