I used a slightly different approach, which also seems to work:
input = [
    (24679461, 24680333),
    (24679455, 24680312),
    (24679455, 24680333),
    (24679455, 24680333),
    (24679455, 24680333),
    (24679455, 24680312),
    (24679464, 24680333),
    (24679452, 24680333),
    (24679464, 24680339),
    (24679461, 24680333),
    (24679461, 24680333),
    (24679449, 24680333),
    (24679461, 24680333),
    (24679464, 24680333),
    (24679452, 24680333),
    (35167152, 35167547),
    (35167209, 35167547),
]
delta = 100
output = {}
# 24679449 - 24680333
# 35167152 - 35167547
for r in input:
    start = r[0]
    end = r[1]
    # if already have a range with the same start, just extend if new range is longer
    if start in output:
        print(f"Extending range: start={start}, end={max(output[start], end)}")
        output[start] = max(output[start], end)
    else:
        merged = False
        # if we don't have a range starting at the same location, look for a range that is 'delta' distance away in either direction
        for i in range(start - delta, start + delta):
            if i in output:
                # if we found a range with lower starting position, extend it if the new one is longer
                if i < start:
                    print(f"Extending upper range: start={i}, end={max(output[i], end)}")
                    output[i] = max(output[i], end)
                else:
                    # otherwise if we found a range with higher starting position we need to replace it with the starting position of the new one
                    # and keep the "longer" end (and also delete the old range which has now been "consumed")
                    print(f"Extending lower and upper range: start={start}, end={max(output[i], end)}")
                    output[start] = max(output[i], end)
                    del output[i]
                merged = True
                break
        # if we haven't merged the new range with anything just save it
        if not merged:
            print(f"Adding new range: start={start}, end={end}")
            output[start] = end
print(output)
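For comparison, the same idea can be sketched as a single pass over the ranges sorted by start. This is only an illustrative variant, not the code above: the function name `merge_nearby_ranges` and the shortened `example` list (a subset of the data above) are made up for this sketch, and it assumes a range belongs to a cluster when its start is at most `delta` above the cluster's lowest start. Because it sorts first, it does not depend on input order, so it may not match the loop above on every input.

```python
def merge_nearby_ranges(ranges, delta=100):
    """Merge (start, end) ranges whose start is within `delta` of a cluster's lowest start."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][0] <= delta:
            # Same cluster: keep the lowest start, extend the end if the new one is longer.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # No cluster close enough: start a new one.
            merged.append([start, end])
    return [tuple(r) for r in merged]


# Hypothetical example data: a subset of the ranges from the post.
example = [
    (24679461, 24680333),
    (24679455, 24680312),
    (24679464, 24680339),
    (24679449, 24680333),
    (35167152, 35167547),
    (35167209, 35167547),
]
print(merge_nearby_ranges(example))
# → [(24679449, 24680339), (35167152, 35167547)]
```

Sorting first means each range only has to be compared with the most recent cluster, instead of scanning every possible start position within `delta`.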
FYI: I don't know if Biostars messed it up, but your code doesn't work as posted. Your `for r in input:` loop isn't indented correctly, and even when that is fixed, the output printed is just the last range. It might be fine on your end, and this may all have been caused by formatting.
I think it should be working now.
I see it was definitely mis-indented earlier. I thought it was worth pointing out, as I didn't want you to come back here later and find it not working.