Measuring sequence similarity between draft genomes
4.4 years ago
Raghul ▴ 200

Hi

I have two draft genomes of closely related bacterial species, both downloaded from NCBI Genome. Their sequencing methods and assembly tools differ. I want to compare the two whole genomes and measure their sequence similarity. What tools are available to do that? I am using Windows. My first objective is to find out how similar they are, e.g. 80% or 90%.

Some details for species 1:
Assembly type: na
Assembly level: Scaffold
Genome representation: full
RefSeq assembly and GenBank assembly identical: yes
WGS Project

Some details for species 2:
Assembly type: na
Assembly level: Scaffold
Genome representation: full
RefSeq assembly and GenBank assembly identical: yes
WGS Project
Assembly method: Unicycler v. 0.4.7
Expected final version: no
Genome coverage: 81.4x
Sequencing technology: Illumina HiSeq

Total sequence length: 3,827,202 (species 1) and 3,804,728 (species 2)
Total ungapped length: 3,826,102 (species 1) and 3,789,834 (species 2)

I hope the newly added information above helps. Thank you for all your replies!

Thanx
Raghul

genome_comparison draft_genome sequence_similarity • 1.2k views

Thanx

Please use professional language on professional/scientific forums, not IM/SMS jargon.


Mensur's suggestions below are a good start. Appropriate methods will somewhat depend on how close you expect the sequences to be.

Are these two sequences from different isolates of the same bacterium? Assemblies of the exact same clone? Or are they entirely different species?

4.4 years ago
Mensur Dlakic ★ 28k

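FastANI and pyani both estimate average nucleotide identity (ANI) between whole genomes. FastANI is alignment-free; pyani's main methods (ANIm, ANIb) rely on MUMmer or BLAST alignments, and it also offers an alignment-free tetranucleotide comparison (TETRA).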
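To give a feel for what an alignment-free identity estimate measures, here is a minimal Python sketch. It is not FastANI's or pyani's algorithm: it compares the complete sets of canonical k-mers from two sequences and converts their Jaccard index into an identity estimate with the Mash distance formula, D = -ln(2j / (1 + j)) / k. The function names and the toy sequences are made up for illustration, and the approach only suits small inputs, since real tools sketch the k-mer sets rather than storing all of them.

```python
import math

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_kmers(seq, k):
    """Return the set of canonical k-mers of seq.

    A canonical k-mer is the lexicographically smaller of the k-mer and
    its reverse complement, so both strands give the same set.
    k-mers containing anything other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) - set("ACGT"):
            continue  # skip N and other ambiguity codes
        revcomp = kmer.translate(_COMPLEMENT)[::-1]
        kmers.add(min(kmer, revcomp))
    return kmers


def jaccard(a, b):
    """Jaccard index of two sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def mash_identity(j, k):
    """Identity estimate 1 - D, with D the Mash distance for Jaccard index j."""
    if j <= 0:
        return 0.0
    distance = -math.log(2 * j / (1 + j)) / k
    return max(0.0, 1.0 - distance)


if __name__ == "__main__":
    # Toy sequences (hypothetical); real genomes would be read from FASTA files.
    seq1 = "ACGTTGCATGCCGATAGCTAGCTAGGATCCGATCGATTAGC"
    seq2 = "ACGTTGCATGCCGATAGCTTGCTAGGATCCGATCGATTAGC"
    k = 7
    j = jaccard(canonical_kmers(seq1, k), canonical_kmers(seq2, k))
    print(f"Jaccard = {j:.3f}, estimated identity = {mash_identity(j, k):.3f}")
```

For the actual genomes, FastANI's documented invocation takes a query, a reference and an output file, for example `fastANI -q genome1.fna -r genome2.fna -o out.txt` (the file names here are placeholders).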

4.4 years ago
h.mon 35k

Are the two draft genomes different assemblies of the same data set, e.g., made with different assemblers or with different read filtering? Or are they closely related but different strains? Please provide more information.

You can use QUAST to assess the quality of genome assemblies and to compare several assemblies. You can also use Mauve to align the assemblies and visualize the alignment.
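Among the statistics QUAST reports for an assembly is N50, a measure of contiguity. As a rough illustration of what that number means (a sketch with made-up function names, not QUAST's own code), the following Python computes scaffold lengths from FASTA-formatted lines and derives N50 from them:

```python
def fasta_lengths(lines):
    """Return the length of each record in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Largest length L such that records of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # no records


if __name__ == "__main__":
    # Tiny in-memory example; for a real assembly, pass an open file instead.
    toy = [">scaffold1", "ACGTACGT", "ACGT", ">scaffold2", "GGCC", ">scaffold3", "TT"]
    sizes = fasta_lengths(toy)
    print(sizes, n50(sizes))
```

Comparing such numbers between the two assemblies says something about how fragmented each one is, not about how similar the genomes are, so it complements rather than replaces an ANI estimate.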
