I am new to Hi-C data and am wondering how to handle zero contact frequencies.
I started with normalized Hi-C data and removed the NAs, but there is still a substantial number of zeros. Some of them are true zeros (no contact), but others may be due to limited sequencing depth, or to loci that happened not to be crosslinked, etc. I wonder how others treat those zeros.
I am trying to compare the contact frequency of a group of paired loci (pairs of interest) against random pairs, to see whether my pairs of interest have significantly higher or lower contact frequency than random pairs. The options I can think of are: (1) remove all zeros => this would skew the random selection; (2) include all zeros => contact could be underestimated because of technical limitations; (3) add a very small value to randomly picked zeros?
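For concreteness, the comparison under option (2), with zeros kept, could be sketched roughly as below. This is only a minimal illustration in Python with NumPy, not an established Hi-C method: the function name `compare_to_random_pairs`, the toy values, and the choice of the mean as the summary statistic are all assumptions made for the example. It also ignores everything else about how the random pairs should be chosen.

```python
import numpy as np

def compare_to_random_pairs(interest, background, n_draws=1000, seed=0):
    """Empirical test (hypothetical helper): is the mean contact frequency
    of the pairs of interest higher or lower than that of equally sized
    random sets of pairs drawn from the background?"""
    rng = np.random.default_rng(seed)
    interest = np.asarray(interest, dtype=float)
    background = np.asarray(background, dtype=float)

    observed = interest.mean()

    # Draw n_draws random sets, each the same size as the set of interest.
    # Zeros are left in the background pool, i.e. option (2).
    draws = rng.choice(background, size=(n_draws, interest.size), replace=True)
    null_means = draws.mean(axis=1)

    # Add-one correction so an empirical p-value is never exactly 0.
    p_higher = (1 + np.sum(null_means >= observed)) / (n_draws + 1)
    p_lower = (1 + np.sum(null_means <= observed)) / (n_draws + 1)
    return observed, p_higher, p_lower

# Toy, made-up normalized contact frequencies (not real data).
background = np.array([0, 0, 0, 0, 1, 2, 0, 3, 0, 1], dtype=float)
interest = np.array([2, 3, 0, 4], dtype=float)

observed, p_higher, p_lower = compare_to_random_pairs(interest, background)
print(observed, p_higher, p_lower)
```

Running the same function on a background with the zeros filtered out (option 1) would show how much the conclusion depends on the treatment of zeros.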
Any comments on how you handle zeros would be appreciated!