How to determine % similarity between genomes?
3.7 years ago
A_heath ▴ 170

Hi all,

I am aligning multiple bacterial genomes and I would like to know how I can obtain a percent identity between them.

Is this something that either Mugsy or Mauve can report?

Thank you for your answers.

Audrey

mauve mugsy genome-alignment • 1.2k views
3.7 years ago
5heikki 11k

I recommend Mash

mash dist genome1.fna genome2.fna

Thank you for your help.

I used Mash and got the following results:

Mygenome.fasta Close_genome_1.fasta 0.0196 494/1000

Mygenome.fasta Close_genome_2.fasta 0.0174 530/1000

I do not really understand the meaning of the two scores.

In this case, which genome is closer? Genome 1 or 2?

Thanks in advance


Close_genome_2 is closer. ANI is approximately 1 - Mash distance, so here 1 - 0.0174 = 0.9826, i.e. about 98.26% identity. The last column shows the number of shared hashes out of the sketch size (1,000 by default).

You can get more precise results by sketching your genomes first with, e.g., a k-mer size of 17 and a sketch size of 10,000 (mash sketch -k 17 -s 10000 input.fna) and then comparing the resulting .msh files with mash dist.
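To make the two scores concrete, here is a minimal Python sketch that parses `mash dist` output, converts each distance to an approximate ANI, and picks the closest genome. It assumes the usual tab-delimited layout (reference, query, distance, p-value, shared hashes); the p-value of 0 in the sample rows is an assumption, since the values posted above do not include one. The helper `mash_distance` is a hypothetical name; it applies the Mash distance formula D = -(1/k) ln(2j / (1 + j)), where j is the fraction of shared hashes and k defaults to 21, to show how the last column relates to the distance.

```python
import math


def parse_mash_dist(text):
    """Parse `mash dist` output into a list of dicts, one per comparison."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # The last column looks like "494/1000": shared hashes / sketch size.
        shared, sketch_size = (int(x) for x in fields[-1].split("/"))
        distance = float(fields[2])
        rows.append({
            "reference": fields[0],
            "query": fields[1],
            "distance": distance,
            # Approximation only: ANI ~ 1 - Mash distance.
            "approx_ani_percent": (1.0 - distance) * 100.0,
            "shared": shared,
            "sketch_size": sketch_size,
        })
    return rows


def mash_distance(shared, sketch_size, k=21):
    """Mash distance from a shared-hash count (k=21 is Mash's default k-mer size)."""
    j = shared / sketch_size  # Jaccard index estimate
    if j == 0:
        return 1.0  # no shared hashes: maximum distance
    return -math.log(2 * j / (1 + j)) / k


# Sample rows based on the values in this thread (p-value column assumed to be 0).
EXAMPLE = (
    "Mygenome.fasta\tClose_genome_1.fasta\t0.0196\t0\t494/1000\n"
    "Mygenome.fasta\tClose_genome_2.fasta\t0.0174\t0\t530/1000\n"
)

rows = parse_mash_dist(EXAMPLE)
closest = min(rows, key=lambda r: r["distance"])
print(f"{closest['query']}: ~{closest['approx_ani_percent']:.2f}% ANI")
```

With the default k of 21, 494/1000 and 530/1000 shared hashes give distances of roughly 0.0197 and 0.0175, which agrees with the values reported above: more shared hashes means a smaller distance.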

3.7 years ago
Carambakaracho ★ 3.3k

What you're probably looking for is average nucleotide identity (ANI).

This is a tool I have always wanted to test, although it is no longer relevant to my work:

FastANI (publication)
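If you try FastANI, its results can be read in much the same way as Mash output. The sketch below assumes the tab-delimited, five-column output described in the FastANI README (query, reference, ANI value, count of bidirectional fragment mappings, total query fragments). The function name and the numbers in the sample row are made up for illustration; they are not results from this thread.

```python
def parse_fastani(text):
    """Parse FastANI output rows into dicts (assumes 5 tab-delimited columns)."""
    hits = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) != 5:
            continue  # skip blank or unexpected lines
        query, reference, ani, mapped, total = fields
        hits.append({
            "query": query,
            "reference": reference,
            "ani_percent": float(ani),
            # Fraction of query fragments that mapped bidirectionally.
            "aligned_fraction": int(mapped) / int(total),
        })
    return hits


# Hypothetical sample row, for illustration only.
SAMPLE = "Mygenome.fasta\tClose_genome_2.fasta\t98.31\t1450\t1602\n"

hits = parse_fastani(SAMPLE)
best = max(hits, key=lambda h: h["ani_percent"])
print(best["reference"], best["ani_percent"])
```

The aligned fraction is worth checking alongside the ANI value, since a high identity computed over only a small part of the genome is less informative.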

Further reading from a quick web search:

https://www.sciencedirect.com/science/article/pii/S0580951714000087

https://img.jgi.doe.gov/docs/ANI.pdf
