Question

Identical proteins but different Blast results

7.6 years ago

paul.hrab05 • 0

Hi, everyone

I ran a BLAST alignment of one protein against the whole genome. The results are confusing because identical proteins appear in several alignments with different results. One example: https://ibb.co/kZ6T9Q. I used Bio-Linux and BLAST+ 2.6.1. Do you have any idea why this happened?

Blast alignment • 1.9k views

ADD COMMENT • link updated 7.6 years ago by Ahill ★ 2.0k • written 7.6 years ago by paul.hrab05 • 0

Can you give an example of the query sequence?

ADD REPLY • link 7.6 years ago by ATpoint 87k

I hope I've understood you:

SCO2792 MSHDSTAAPEAAARKLSGRRRKEIVAVLLFSGGPIFESSIPLSVFGIDRQDAGVPRYRLL VCAGEDGPLRTTGGLELTAPQGLEAISRAGTVVVPAWRSITSPPPEEALDALRRAHEEGA RIVGLCTGAFVLAAAGLLDGRPATTHWMYAPTLAKRYPSVHVDPRELFVDDGDVLTSAGT AAGIDLCLHIVRTDHGNEAAGALARRLVVPPRRSGGQERYLDRSLPEEIGADPLAEVVAW ALEHLHEQFDVETLAARAYMSRRTFDRRFRSLTGSAPLQWLITQRVLQAQRLLETSDYSV DEVAGRCGFRSPVALRGHFRRQLGSSPAAYRAAYRARRPQGDRQPDPDTAAAGATRPLPP SDPPASLAPENAVPFQTRRTATPMPAGAASVPGQRSAP*

ADD REPLY • link 7.6 years ago by paul.hrab05 • 0

Is the protein in question SCO2792? And what genome are you using?

ADD REPLY • link 7.6 years ago by pfs ▴ 280

What do those columns mean? Is this one of the standard blast output formats?

ADD REPLY • link 7.6 years ago by GenoMax 150k

ADD REPLY • link 7.6 years ago by paul.hrab05 • 0

As Ahill says, you get multiple hits to the same subject because there are two high-scoring subsections (maybe active sites or conserved domains) within the protein. BLAST is a local alignment tool, so it reports each high-scoring stretch as its own alignment rather than forcing a single end-to-end alignment.

Part of the output is the hit start and end positions, as well as the query start and end positions. These tell you which stretches of your query sequence match which stretches of the subject sequence.
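To make that concrete, here is a minimal Python sketch that groups tabular BLAST rows by query/subject pair and prints the coordinates of each alignment. It assumes the default 12-column layout of `-outfmt 6`; the table in the screenshot may use a different column order, so adjust `COLUMNS` if needed. The two example rows are hypothetical: only the identifiers and the 46/53 percent identities come from this thread, the coordinates and scores are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Default column order of BLAST+ tabular output (-outfmt 6).
# Assumption: the poster's table may use a different, custom layout.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def group_hsps(handle):
    """Group tabular BLAST rows (one row per local alignment) by (query, subject)."""
    groups = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and the comment lines that -outfmt 7 adds.
        if not row or row[0].startswith("#"):
            continue
        hsp = dict(zip(COLUMNS, row))
        for key in ("qstart", "qend", "sstart", "send"):
            hsp[key] = int(hsp[key])
        groups[(hsp["qseqid"], hsp["sseqid"])].append(hsp)
    return groups


# Hypothetical rows: coordinates, lengths, E-values and bit scores are made up.
EXAMPLE = (
    "SCO2792\tSCO0697\t46.0\t100\t54\t0\t10\t109\t5\t104\t1e-20\t90.0\n"
    "SCO2792\tSCO0697\t53.0\t60\t28\t0\t250\t309\t200\t259\t1e-10\t60.0\n"
)

groups = group_hsps(io.StringIO(EXAMPLE))
for (query, subject), hsps in groups.items():
    print(f"{query} vs {subject}: {len(hsps)} local alignment(s)")
    for h in hsps:
        print(f"  query {h['qstart']}-{h['qend']}  "
              f"subject {h['sstart']}-{h['send']}  identity {h['pident']}%")
```

With a real results file, replace `io.StringIO(EXAMPLE)` with an open file handle. Any pair that lists more than one alignment is a case like the one in the question.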

ADD REPLY • link 7.6 years ago by Joe 22k

score 2 · Answer 1 · 2017-09-26

BLAST is a local alignment method: it can and will give multiple alignments between a single query and target pair, depending on how it is run. From a quick look, I'd guess those are two different sub-sequences of your query (SCO2792) aligning to different sub-sequences of the target (SCO0697), each with a high alignment score. The two local alignments you show between the query and target have different start and end coordinates, lengths, E-values, etc., as shown in your table. Use the column headers you listed in the comments above to read off the start and end coordinates of each of the two alignments in the query and target.
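One quick way to check that guess is to test whether the two alignments cover separate regions of the query. The sketch below is only an illustration: the helper names are made up for this post, and the coordinates are hypothetical placeholders to be replaced with the start and end values from your own table.

```python
def span(start, end):
    """Return (low, high) so reversed coordinates are handled too."""
    return (min(start, end), max(start, end))


def overlap_length(a, b):
    """Number of positions shared by two inclusive coordinate ranges."""
    (a_lo, a_hi), (b_lo, b_hi) = span(*a), span(*b)
    return max(0, min(a_hi, b_hi) - max(a_lo, b_lo) + 1)


# Hypothetical query coordinates (qstart, qend) for the two alignments.
hsp1_query = (10, 109)
hsp2_query = (250, 309)

shared = overlap_length(hsp1_query, hsp2_query)
if shared == 0:
    print("The two alignments cover different parts of the query.")
else:
    print(f"The two alignments share {shared} query positions.")
```

If the ranges do not overlap in either the query or the target, the two rows are separate local alignments of different regions, not the same alignment reported twice.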

score 0 · Answer 2 · 2017-09-26

7.6 years ago

pfs ▴ 280

Any chance you are observing gene duplication? What is the % identity between different alignments?

ADD COMMENT • link 7.6 years ago by pfs ▴ 280

Can you explain what you mean by "% identity between different alignments"? If you mean the example, they are 46 and 53. As I understand it, it can't be gene duplication, because there is only one sequence for the query and one for the subject. I've checked the files for errors and they seem fine.

P.S. I mean the input sequence files: there is only one sequence per protein.

ADD REPLY • link 7.6 years ago by paul.hrab05 • 0