Part of my research interest is to look for unusual conservation patterns in pairwise genome comparisons. Naturally, such patterns might suggest functional elements, or occasionally, lateral DNA transmission between organisms.
So far I have relied on ad hoc rules to extract these. However, a rule tuned for one species pair does not transfer to another: for example, a 200 bp perfect match between chicken and human would be unusual, but the same match is far less surprising in a chimp-human comparison.
My question is whether there is a statistical method to measure the "surprise" of an alignment, given the expected sequence divergence between the two species. I realize there could be complications, since selective pressure varies across different types of sequence, but I would like to see how others approach this problem.
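To make the question concrete, here is a minimal sketch of the simplest null model I can think of. It assumes every alignment column matches independently with the same probability p (the expected identity for the species pair) and asks how likely a perfect run of a given length is somewhere in n aligned columns. The function names and the identity values in the usage lines are placeholders for illustration, not measured divergences.

```python
import math


def prob_perfect_run_exact(n, run_len, p):
    """P(at least one run of >= run_len consecutive matches in n columns).

    Assumes independent columns, each matching with probability p.
    Runs in O(n) time and memory, so it suits modest n; the final
    subtraction also loses precision when the answer is below ~1e-16.
    """
    if run_len <= 0:
        raise ValueError("run_len must be positive")
    if n < run_len:
        return 0.0
    # Probability of a mismatch followed by run_len matches: the event that
    # completes the first run at a given position (after position run_len).
    hit = (1.0 - p) * p ** run_len
    # no_run[i] = P(no run of run_len matches within the first i columns)
    no_run = [1.0] * (n + 1)
    no_run[run_len] = 1.0 - p ** run_len
    for i in range(run_len + 1, n + 1):
        no_run[i] = no_run[i - 1] - hit * no_run[i - run_len - 1]
    return 1.0 - no_run[n]


def prob_perfect_run_poisson(n, run_len, p):
    """Poisson approximation to the same probability, usable at genome scale.

    Counts run starts (a run at the first column, or a mismatch followed by
    run_len matches elsewhere) and treats them as rare independent events.
    """
    if n < run_len:
        return 0.0
    expected_starts = p ** run_len * (1.0 + (n - run_len) * (1.0 - p))
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x
    return -math.expm1(-expected_starts)


if __name__ == "__main__":
    n_columns = 1_000_000  # hypothetical number of aligned columns
    for label, identity in [("distant pair", 0.70), ("close pair", 0.99)]:
        p_value = prob_perfect_run_poisson(n_columns, 200, identity)
        print(f"{label}: P(200 bp perfect run) ~ {p_value:.3g}")
```

Under this model the same 200 bp run gets a very different probability depending on p, which is the behaviour I am after. The independence assumption is clearly too strong, though, because substitution rates and selective pressure vary along the genome, so I would treat the resulting numbers as a rough baseline rather than real p-values.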
Thanks for the pointer. According to the paper, "We identify HCNEs by scanning pairwise BLASTZ net whole-genome alignments (nets) downloaded from the UCSC Genome Browser database for regions with at least I identities over C alignment columns." This is the kind of arbitrary rule I mentioned, and I am hoping there is a better alternative.