I have found the following behavior in bx-python, which looks like a bug to me:
I have a loop in which I build a new interval tree from the same data for each max_gap value and count how many clusters it contains. It outputs:
With max_gap: 0 #clusters is: 156
With max_gap: 1000 #clusters is: 151
With max_gap: 2000 #clusters is: 155
With max_gap: 3000 #clusters is: 155
With max_gap: 4000 #clusters is: 156
With max_gap: 5000 #clusters is: 158
Is this possible? I thought the number of clusters should decrease monotonically as cluster_distance increases.
Just wanted to double check before I report this as a possible bug.
The relevant code is:
max_distance = [i for i in range(0,5001,1000)]
for max_gap in max_distance:
    temp_tree = build_cluster_tree('chrX2.map', 10, max_gap)
    print "With max_gap:", max_gap, " #clusters is:",len(temp_tree.getregions())
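One way to test the monotonicity expectation without bx-python is a toy version of gap-based clustering. This is only a minimal sketch, not the library's implementation: `count_clusters`, its parameters, and the positions are hypothetical, and it assumes that the second argument above (10) acts as a minimum number of reads a cluster needs before it is reported. Under that assumption the count does not have to be monotonic, because a larger gap can both merge two reported clusters into one and lift a group of scattered reads over the minimum size.

```python
def count_clusters(positions, max_gap, min_size):
    """Count gap-based clusters holding at least min_size positions.

    Hypothetical helper for illustration only; it does not call bx-python.
    Neighbouring positions join the same cluster when they are at most
    max_gap apart.
    """
    count = 0
    size = 0
    previous = None
    for pos in sorted(positions):
        if previous is not None and pos - previous > max_gap:
            # Gap too large: close the current cluster and start a new one.
            if size >= min_size:
                count += 1
            size = 0
        size += 1
        previous = pos
    # Close the last open cluster.
    if size >= min_size:
        count += 1
    return count


# Made-up toy positions, not taken from chrX2.map.
positions = [0, 10, 20, 100, 110, 120, 1000, 1500, 2000]

# With a minimum cluster size of 3 the count goes down and then up again.
with_min_size = [count_clusters(positions, gap, 3) for gap in (10, 100, 500)]

# With no size filter (min_size=1) the count only ever decreases.
without_filter = [count_clusters(positions, gap, 1) for gap in (10, 100, 500)]
```

If the real data behaves like this toy case, rerunning the loop with the minimum set to 1 instead of 10 should give a count that never increases with max_gap.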
More info about bx-python's cluster trees here: Finding and displaying short reads clustered in the genome
As an aside: list comprehensions are cool and all, but range() outputs a list anyway.