Hello everyone,
I have hmmscan output in hmmscan3-domtab format with more than 10000 queries.
I want to filter each profile hit based on hit_coverage (hmm profile coverage).
I iterated over QueryResult --> Hit --> HSP --> HSPFragment, calculated hit_coverage, and wrote the results to another file (the filtered file) using the write method of SearchIO in Biopython. However, I found that all the hits were going to the filtered file.
Below is the code I have used:
import re
from Bio import SearchIO

for qresult in SearchIO.parse(file_path, 'hmmscan3-domtab'):
    for each_hit in qresult:
        for each_hsp in each_hit:
            # Percentage of the HMM profile covered by this HSP
            hit_coverage = 100 * (each_hsp.hit_end - each_hsp.hit_start) / each_hit.seq_len
            if re.match("^gfam", each_hit.id) and hit_coverage >= 75:
                SearchIO.write(qresult, filtered_file_handler, 'hmmscan3-domtab')
            elif not re.match("^gfam", each_hit.id) and hit_coverage >= 50:
                SearchIO.write(qresult, filtered_file_handler, 'hmmscan3-domtab')
Any suggestions for writing only the filtered hits to the filtered file?
PS: I have noticed that I am writing the whole qresult each time. I need some suggestions for refining this.
Thanks,
Vijay N
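
One possible approach, as a minimal sketch rather than a definitive answer: build a filtered copy of each QueryResult and write those copies in a single SearchIO.write call, instead of writing the original qresult inside the loop. The sketch assumes the cutoff should apply to each HSP (domain) separately, as in the code above, and that a hit should be dropped when none of its HSPs pass. The names passes_coverage, filter_qresult, write_filtered and the two threshold constants are hypothetical helpers introduced for illustration; hit_filter, hit_map and Hit.filter are the Biopython methods used for the filtering itself.

```python
import re

GFAM_MIN_COVERAGE = 75.0   # percent, for profiles whose ID starts with "gfam"
OTHER_MIN_COVERAGE = 50.0  # percent, for all other profiles


def passes_coverage(hit_id, hit_start, hit_end, profile_len):
    """Return True if the aligned span covers enough of the HMM profile."""
    coverage = 100.0 * (hit_end - hit_start) / profile_len
    if re.match(r"^gfam", hit_id):
        return coverage >= GFAM_MIN_COVERAGE
    return coverage >= OTHER_MIN_COVERAGE


def filter_qresult(qresult):
    """Return a copy of qresult that keeps only the HSPs passing the cutoff."""
    def hsp_ok(hit, hsp):
        return passes_coverage(hit.id, hsp.hit_start, hsp.hit_end, hit.seq_len)

    # First drop hits with no passing HSP, then trim the HSPs of the rest.
    kept = qresult.hit_filter(lambda hit: any(hsp_ok(hit, hsp) for hsp in hit))
    return kept.hit_map(lambda hit: hit.filter(lambda hsp: hsp_ok(hit, hsp)))


def write_filtered(in_path, out_path, fmt="hmmscan3-domtab"):
    """Parse in_path, filter every query, and write the survivors to out_path."""
    # Imported here so the coverage check above can be tried without Biopython.
    from Bio import SearchIO

    filtered = (filter_qresult(q) for q in SearchIO.parse(in_path, fmt))
    # Skip queries that have no hits left after filtering.
    non_empty = (q for q in filtered if len(q) > 0)
    # A single write call, so the output is written as one file in one pass.
    return SearchIO.write(non_empty, out_path, fmt)
```

Calling write_filtered(file_path, "filtered.domtab") (the output file name is a placeholder) should then write only the hits and domains that meet the coverage cutoffs. Because the generators are consumed lazily, the 10000+ queries are not all held in memory at once.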