I have count data describing how many markers map to each chromosome position:
- [0,0,0,1,0,0,0,2,0,0,0,1,1,....]
However, I have three or even four orders of magnitude fewer markers than available positions, so the vector is mostly zeros.
- My question is: how can I find clusters of markers with a non-random distribution, e.g. regions that are too dense compared to random positioning?
I have calculated the distribution of pairwise distances between markers and compared it with distances simulated from a random distribution, and the two distributions differ.
I assume that markers are localized in both a random and a non-random fashion, but I am only interested in the non-random clusters.
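A comparison of this kind could be sketched roughly as below. This is only an illustration under assumptions, not necessarily the exact procedure used here: it measures gaps between consecutive markers (rather than all pairs), it treats "random" as uniform placement with several markers allowed on one position, and the helper names (`counts_to_positions`, `consecutive_gaps`, `simulate_gaps`, `fraction_at_most`) as well as the threshold of 1 are hypothetical.

```python
import random

def counts_to_positions(counts):
    """Expand a per-position count vector into one position per marker."""
    return [i for i, c in enumerate(counts) for _ in range(c)]

def consecutive_gaps(positions):
    """Distances between neighbouring markers after sorting by position."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def simulate_gaps(n_markers, n_positions, n_sims, rng):
    """Pool gaps from n_sims uniform random placements of n_markers.

    Assumption: positions are drawn independently, so two markers may
    land on the same position (as a count of 2 in the data suggests).
    """
    gaps = []
    for _ in range(n_sims):
        placed = [rng.randrange(n_positions) for _ in range(n_markers)]
        gaps.extend(consecutive_gaps(placed))
    return gaps

def fraction_at_most(gaps, threshold):
    """Share of gaps that are no larger than threshold."""
    return sum(g <= threshold for g in gaps) / len(gaps)

# Toy input: the visible prefix of the count vector from the question.
counts = [0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 1, 1]
observed = consecutive_gaps(counts_to_positions(counts))

rng = random.Random(0)  # fixed seed so the simulation is repeatable
simulated = simulate_gaps(
    n_markers=sum(counts), n_positions=len(counts), n_sims=1000, rng=rng
)

# An excess of short gaps in the observed data hints at clustering.
print("observed short-gap share: ", fraction_at_most(observed, 1))
print("simulated short-gap share:", fraction_at_most(simulated, 1))
```

On real data, `n_positions` would be the chromosome length and the threshold would be chosen from the simulated gap distribution, for example a low quantile of it.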
- I am also looking at how similar my problem is to other bioinformatics approaches in sequence analysis (SNP, HMM in CpG island discovery,... ) for some ideas...