Hi there!
I'm trying to write a function that reads a bigwig file and calculates the mean score of a given genomic interval. This is my current function:
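from ngslib import BigWigFile

def bigwig_mean(bigwig, chr, start, end):
    bw = BigWigFile(bigwig)
    score_sum = 0
    mean_score = 0
    # Sum the score of every record returned for the interval
    for i in bw.fetch(chrom=chr, start=start, stop=end):
        score_sum += i.score
    # Guard against zero-length intervals
    if (end - start) != 0:
        mean_score = score_sum / (end - start)
    else:
        mean_score = 0
    # Close before returning; a statement placed after return never runs
    bw.close()
    return mean_score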
This function works, but each time it reads the bigwig file with
for i in bw.fetch(chrom=chr, start=start, stop=end)
RAM is consumed and is not released after bw.close(). Since I need to call this function for a large set of genomic intervals, I run out of memory before the task finishes. There seems to be a memory leak in this function, and it would be great if anyone could tell me what's wrong or suggest an alternative way to write it.
Cheers,
===UPDATE===
I found the original ngslib documentation here: https://ngslib.appspot.com/BigWigFile
I am not sure how the fetch method is implemented, but if it is a generator, you can delegate to it with yield from and consume the records one at a time. If it is not, you could modify it to use yield so that it no longer has to hold every record in memory at once. See here for some background: http://stackoverflow.com/questions/9708902/in-practice-what-are-the-main-uses-for-the-new-yield-from-syntax-in-python-3#9709131
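To make the generator idea concrete, here is a minimal sketch in Python 3. It assumes only what the question's code shows: an object with fetch(chrom=..., start=..., stop=...) whose records have a .score attribute. The helper names iter_records, interval_mean and iter_interval_means are made up for illustration and are not part of ngslib. I have not checked whether this avoids the leak; if the memory is held inside the library itself, streaming on the Python side may not help.

```python
def iter_records(bw, chrom, start, end):
    """Delegate to bw.fetch so records are consumed one at a time."""
    # Assumption: bw.fetch accepts chrom/start/stop keywords, as in the question.
    yield from bw.fetch(chrom=chrom, start=start, stop=end)


def interval_mean(bw, chrom, start, end):
    """Mean score over [start, end), using the same formula as the question."""
    length = end - start
    if length == 0:
        return 0.0
    # The generator expression never builds a list of records.
    total = sum(rec.score for rec in iter_records(bw, chrom, start, end))
    return total / length


def iter_interval_means(bw, intervals):
    """Yield one mean per (chrom, start, end) tuple, reusing a single open file."""
    for chrom, start, end in intervals:
        yield interval_mean(bw, chrom, start, end)
```

With ngslib this would presumably be used by opening the file once, e.g. bw = BigWigFile(path), looping over iter_interval_means(bw, intervals), and calling bw.close() in a finally block, rather than reopening the file for every interval.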