Question

How to identify 2 genes that have highly identical sequences

4.4 years ago

Ak ▴ 60

Pardon me if I'm asking a rather basic question, as I'm still new to bioinformatics.

I'm trying to identify 2 genes in my assembled genome. I have used BLAST+, but because the 2 genes are highly similar (98.2% identity when the two gene sequences are aligned to each other), the hits I get for both genes are exactly the same in terms of subject position, query position, bit score, E-value, and percent identity. So I was wondering:

Since the subject positions are exactly the same, how can I know which gene the BLAST hit sequence belongs to?

I've used NCBI BLAST to double-check, and I assumed that the gene with the higher percent identity (both genes have an E-value of 0 and 100% query coverage) is the correct identity of the hit sequence. Which brings me to my next question:

What about the sequence of my other gene?

The remaining BLAST+ hits either have an alignment length that is too short, covering only half of my query length, or a percent identity below 90% (because this is an intra-species search, I would expect it to be higher than 90%).

So I hope someone can give me some advice on how I can identify 2 highly identical genes in my genome. Thanks!

gene genome blastn gene identification • 1.1k views


Alright, thanks a lot for the advice!

Reply • 4.4 years ago by Ak ▴ 60

score 1 · Accepted Answer · 2021-03-19

4.4 years ago

Mensur Dlakic ★ 29k

For two genes that share that kind of similarity, it almost doesn't matter which one you pick. It may help you if you translate your gene into protein and use that as a query. It wouldn't surprise me if the match ends up being 100% for both hits, in which case it literally wouldn't matter which one you pick.
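To check whether the two genes really encode the same protein, something along these lines might help. It is only a rough sketch: the two sequences are made-up placeholders (not real genes), it assumes both are in frame starting at the first base, uses the standard genetic code, and compares the proteins position by position without any gapped alignment. The function names `translate` and `percent_identity` are just illustrative.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X for codons with ambiguous bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def percent_identity(seq_a, seq_b):
    """Ungapped, position-by-position identity; assumes equal lengths."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences differ in length; use a real aligner instead")
    if not seq_a:
        return 0.0
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical toy sequences that differ only at synonymous positions.
gene_a = "ATGGCTGAAAAACTGTAA"
gene_b = "ATGGCCGAGAAACTTTAA"

protein_a = translate(gene_a)
protein_b = translate(gene_b)

print("DNA identity:     %.1f%%" % percent_identity(gene_a, gene_b))
print("Protein identity: %.1f%%" % percent_identity(protein_a, protein_b))
```

If the protein identity comes out at 100% while the DNA identity is lower, the nucleotide differences are all synonymous, and either protein can be used as the query. For real genes of different lengths or with indels, a proper pairwise aligner would be needed instead of the simple comparison used here.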