Hi there,
I'm not sure how active this forum is these days, but I'm hoping for some help with blastp results that I cannot wrap my head around. To study similarity between protein regions, I generated my own blastp database and blasted the same FASTA file against it:
/bin/makeblastdb -in infile.fasta -dbtype prot -out database ;
/bin/blastp -db database -query infile.fasta ... (settings)
I was therefore expecting a lot of mirrored results, where aligning query against subject gives the same outcome as aligning subject against query. However, in some cases these reciprocal pairs give slightly different results, e.g.:
Q:Seq1 (length 32) S:Seq2 (length 97) 77.778 %id
Q:Seq2 (length 97) S:Seq1 (length 32) 76.471 %id
After further inspection, it seems that the alignments were not extended the same way:
Q1S2: QHWGQGTLLTVSSGES FDLWGRGTLVTVSSGES
Q2S1: YFDLWGRGTLVTVSSGES YFQHWGQGTLLTVSSGES
Is there a logical explanation for how these alignments could differ depending on the direction of the comparison? And is there a way to prevent this?
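As a quick sanity check on the numbers, here is a minimal Python sketch that recomputes percent identity as identical columns divided by alignment length. The helper name `percent_identity` is made up for illustration. The first call uses the 18-column alignment exactly as printed above; the second drops the leading Y/Y column, which is only an assumption about what a shorter extension might look like, not necessarily the alignment blastp actually reported.

```python
def percent_identity(query_aln: str, subject_aln: str) -> float:
    """Identical columns divided by alignment length, as a percentage."""
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned strings must have the same length")
    # A column counts as identical only if both residues match and it is not a gap.
    matches = sum(q == s and q != "-" for q, s in zip(query_aln, subject_aln))
    return 100.0 * matches / len(query_aln)


# 18-column alignment as printed in the post.
full = percent_identity("YFDLWGRGTLVTVSSGES", "YFQHWGQGTLLTVSSGES")

# Hypothetical: the same alignment with the leading Y/Y column left out.
trimmed = percent_identity("FDLWGRGTLVTVSSGES", "FQHWGQGTLLTVSSGES")

print(f"{full:.3f}")     # 14 identical columns out of 18
print(f"{trimmed:.3f}")  # 13 identical columns out of 17
```

The two results, 77.778 and 76.471, match the two reported %id values, so the difference between the directions is consistent with one alignment being extended by a single extra identical column rather than with any change in how the shared columns are scored.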
I guess the alignment is scored differently when you open a gap in the query than in the target.
I don't think so; there is only a single gap penalty to be set, and it is independent of the AA context.