Interpret genome alignment results
5.1 years ago
el97004 ▴ 80

Hi all!

I assembled two different genomes and wanted to see how similar they are at both the nucleotide and protein levels, so I aligned their nucleotide sequences and their translated nucleotide sequences. Here are the results I obtained:

Nucleotide identity = 90%, protein identity = 57%

How would one make sense of this high nucleotide yet low protein identity? I have been doing a lot of reading, and it seems that if the species are close it's better to compare the DNA sequences, and I believe these two species should be fairly close. However, I am still confused as to why the values would differ so much.

Thanks for your input!

alignment protein nucleotide • 1.4k views

There are lots of reasons for this, and all else being equal this is to be expected.

You need to clarify whether these are DNA sequences of individual genes, whole genomes, etc.

Sorry, I should have clarified. Whole genomes!

It doesn't make any sense to translate a whole genome, and consequently even less to align or compare the resulting translations.

Exactly!! Only translate and compare protein-coding regions. In non-coding regions, DNA similarity can be high, but when ERRONEOUSLY translated, the resulting "protein" sequences may come from different reading frames and therefore show very low similarity. Again, only translate and compare protein-coding regions.
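
To illustrate the reading-frame point, here is a minimal Python sketch. It assumes the standard genetic code; the `translate` helper and the example sequence are made up for illustration and are not taken from the poster's data. The same stretch of DNA gives three unrelated "peptides" depending on the frame it is read in.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate dna starting at offset `frame` (0, 1 or 2); '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' for codons with ambiguous bases
        for i in range(frame, len(dna) - 2, 3)
    )


# Hypothetical sequence, used only to show the effect of the reading frame.
seq = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
frames = {frame: translate(seq, frame) for frame in range(3)}

for frame, peptide in frames.items():
    print(f"frame {frame}: {peptide}")
```

If two genomes are translated wholesale, nothing guarantees that homologous stretches end up in the same frame, so even identical DNA can be compared as unrelated amino-acid strings.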

5.1 years ago
michael.ante ★ 3.9k

Hi,

Small changes at the nucleotide level can lead to drastic changes at the protein level. In a worst-case scenario, a mutation in a gene's 5' region introduces a frameshift, which leads to a totally different product. In such a case you'll have nearly 100% identity at the nucleotide level but nearly none for the protein.
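
A rough sketch of the frameshift case in Python, assuming the standard genetic code. The toy coding sequence is invented, and `difflib.SequenceMatcher` is used only as a crude stand-in for a real aligner, so the numbers are illustrative rather than what an alignment tool would report.

```python
from difflib import SequenceMatcher
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate dna in frame 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def rough_identity(a, b):
    """Matched characters divided by the longer length (crude, not a real alignment)."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(a), len(b))


# Hypothetical toy gene: M A K L E V R P D S stop
original = "ATGGCTAAACTGGAAGTTCGTCCGGATAGCTAA"
# Delete a single base near the 5' end (index 4) to shift the reading frame.
frameshifted = original[:4] + original[5:]

nt_identity = rough_identity(original, frameshifted)
aa_identity = rough_identity(translate(original), translate(frameshifted))

print(translate(original), translate(frameshifted))
print(f"nucleotide identity ~ {nt_identity:.2f}, protein identity ~ {aa_identity:.2f}")
```

One deleted base leaves the DNA almost unchanged, while every codon downstream of the deletion is read differently.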

Depending on your species, you have more or less "junk DNA" (intergenic regions, introns, etc.). These non-coding regions can increase the overall nucleotide identity, but not that of the proteins.

Cheers,

Michael

Thank you, that makes sense. But how about less extreme scenarios? For example, if the third position of a codon is mutated, it could have no effect on the protein sequence.
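
The third-position case can be sketched the same way. This is a toy Python example assuming the standard genetic code, with invented codons chosen so that every substitution is synonymous. Here the effect runs in the opposite direction: nucleotide identity drops while the protein stays identical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate dna in frame 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def positional_identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical toy gene and a variant that differs only at third codon positions.
original_codons = ["ATG", "GCT", "AAA", "CTG", "GAA", "GTT", "CGT", "CCG", "GAT", "AGC", "TAA"]
silent_codons = ["ATG", "GCC", "AAG", "CTC", "GAG", "GTC", "CGC", "CCA", "GAC", "AGT", "TAA"]

original = "".join(original_codons)
silent = "".join(silent_codons)

nt_identity = positional_identity(original, silent)
aa_identity = positional_identity(translate(original), translate(silent))

print(f"nucleotide identity = {nt_identity:.2f}, protein identity = {aa_identity:.2f}")
```

Not every third-position change is silent (for example, ATG to ATA changes Met to Ile), but many are, which is why properly annotated coding regions often look more conserved at the protein level than at the DNA level.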
