Hi, I'm using standalone blastn and got strange output with duplicated rows. If one ID were associated with several different protein IDs, that would not be a problem, since it could have a biological meaning. However, that is not the case here. Let me use ProtORG1_1...ProtORG1_N to denote the IDs of the first organism and ProtORG2_1...ProtORG2_N to denote the IDs of the second one. I have this scenario:
ProtORG1_1 ProtORG2_1 100
ProtORG1_1 ProtORG2_1 0.98
ProtORG1_1 ProtORG2_1 0.80
ProtORG1_1 ProtORG2_1 0.77
ProtORG1_1 ProtORG2_1 0.8
where the third column is the alignment score. How can I eliminate these duplicate rows, given that I already set max_target_seqs = 1?
I am using two different isolates of the same parasite, in case that is relevant.
Thank you
How about
Since you have multiple HSPs, I assume they are all being listed. Perhaps the parameter above would leave only one.
Are you limiting your blast search to a single HSP with
-max_hsps 1
? If not, that may be what's causing you to get multiple results from the same sequence. Depending on what your ultimate goal for the analysis is, you may actually want to keep multiple HSPs per matching sequence.
I have tried -max_hsps 1, but similar problems remain, although they are less severe.
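If duplicate rows still show up after setting -max_hsps 1, one option is to filter the output afterwards and keep only the best-scoring row per query/subject pair. Below is a minimal Python sketch of that idea, not a BLAST feature. The function name `best_hit_per_pair` and the `score_col` parameter are made up for illustration, and it assumes whitespace-separated rows with the score in the third column, as in the example from the question. For standard tabular output (-outfmt 6) the bit score is normally the twelfth column, so `score_col=11` would be the likely choice there.

```python
def best_hit_per_pair(rows, score_col=2):
    """Keep the highest-scoring row for each (query, subject) pair.

    rows      -- iterable of whitespace-separated text lines
    score_col -- 0-based index of the score column (assumed to be numeric)
    """
    best = {}
    for row in rows:
        fields = row.split()
        if len(fields) <= score_col:
            continue  # skip blank or malformed lines
        key = (fields[0], fields[1])  # (query ID, subject ID)
        score = float(fields[score_col])
        # Keep the row only if it beats the best score seen for this pair
        if key not in best or score > best[key][0]:
            best[key] = (score, row.strip())
    # Dicts preserve insertion order, so pairs come out in first-seen order
    return [row for _, row in best.values()]


# The rows from the question
rows = [
    "ProtORG1_1 ProtORG2_1 100",
    "ProtORG1_1 ProtORG2_1 0.98",
    "ProtORG1_1 ProtORG2_1 0.80",
    "ProtORG1_1 ProtORG2_1 0.77",
    "ProtORG1_1 ProtORG2_1 0.8",
]

for line in best_hit_per_pair(rows):
    print(line)
```

On the five example rows this leaves a single line, the one with score 100. Whether dropping the other HSPs is appropriate depends on the goal of the analysis, as noted above.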