Hello everyone,
I am assembling phage genomes, and for a few of my phages I got a single clean contig. However, I am having trouble determining whether each genome is fully closed. Here’s what I have done so far:
Assembly: The genome is in a single contig.
Termini Analysis: I ran PhageTerm, but it could not predict the termini.
Completeness Check: I used CheckV, and it reports:
Phage 1: 97.95% complete
Phage 2: 99.78% complete
Since the genome is already in one contig and CheckV reports high completeness, I am wondering what additional steps I should take to confirm and finalize the complete genome.
Questions:
What alternative methods can I use to identify the phage termini if PhageTerm does not work?
How can I verify that my assembly is truly closed?
Are there any recommended tools for further validating genome closure?
Any suggestions or guidance would be greatly appreciated! Thanks in advance.
Have you tried aligning your contig to available phage genomes (use a subset if you know what kind of phage you are working with)? As long as you are not working with previously unknown phages, you should get good hits.
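If it helps when ranking the hits, here is a rough sketch of how you could work out what fraction of your contig each reference covers. It assumes BLAST tabular output (`-outfmt 6` with the default 12 columns) and a single query per file; the function name `query_coverage` and the example hits are made up for illustration.

```python
from collections import defaultdict


def query_coverage(blast_lines, query_length):
    """Return {subject_id: fraction of the query covered by its HSPs}.

    Assumes default BLAST -outfmt 6 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    and that all lines belong to one query sequence.
    """
    intervals = defaultdict(list)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        sseqid = fields[1]
        qstart, qend = int(fields[6]), int(fields[7])
        intervals[sseqid].append((min(qstart, qend), max(qstart, qend)))

    coverage = {}
    for sseqid, spans in intervals.items():
        spans.sort()
        covered = 0
        cur_start, cur_end = spans[0]
        # Merge overlapping or adjacent HSPs so no query base is counted twice.
        for start, end in spans[1:]:
            if start <= cur_end + 1:
                cur_end = max(cur_end, end)
            else:
                covered += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        covered += cur_end - cur_start + 1
        coverage[sseqid] = covered / query_length
    return coverage


# Invented example hits for a 1000 bp query (not real data).
example_hits = [
    "contig1\trefA\t98.5\t600\t9\t0\t1\t600\t101\t700\t0.0\t1000",
    "contig1\trefA\t97.0\t500\t15\t0\t501\t1000\t701\t1200\t0.0\t800",
    "contig1\trefB\t90.0\t300\t30\t0\t201\t500\t1\t300\t1e-100\t400",
]
cov = query_coverage(example_hits, query_length=1000)
for subject, fraction in sorted(cov.items()):
    print(subject, round(fraction, 3))
```

A reference that covers nearly the whole contig is a more sensible candidate for a full-length comparison than one that only matches a short region.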
I have tried BLAST. I got a few reference genomes that align to my phages, but I am confused about the next step. How do I proceed from here and get a closed genome based on the reference?
These hits look fairly complete. You may want to consider using a global aligner, though, since BLAST primarily looks for local "hits". Based on this BLAST search, you could get the RefSeq genome(s) for the "hits" and do a pairwise alignment with your contig to see how things look. You may also want to do multiple sequence alignments with your samples to see what the population of genomes you have looks like.
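To illustrate what "global" means here, below is a toy Needleman-Wunsch alignment in plain Python. It aligns the two sequences end to end, rather than reporting the best local matches. The scoring values and the function name `global_align` are arbitrary choices for the example, and the quadratic memory use makes it unsuitable for real phage genomes; for those you would use a dedicated alignment tool.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Toy implementation: O(len(a) * len(b))
    time and memory, so only practical for short sequences.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Invented short sequences: the second one lacks a single base.
aln_score, aln_a, aln_b = global_align("ACGTACGT", "ACGACGT")
print(aln_score)
print(aln_a)
print(aln_b)
```

Because every base of both sequences has to appear in the output, a missing or extra stretch at either end of the contig shows up as a run of gaps instead of silently falling outside a local hit.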
While you are doing all this, keep in mind that the genomes you have may not be identical to what is in RefSeq/GenBank. Your genomes may have SNVs and/or larger changes (insertions/deletions).
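Once you have a pairwise alignment, a quick tally of the differences can help you judge how far your genome is from the reference. This is only a sketch: it assumes you already have the two aligned sequences as equal-length strings with `-` for gaps (for example, read from an aligner's output), and `summarize_differences` is a made-up helper name. Ambiguous bases such as N are treated like any other character.

```python
def summarize_differences(aligned_contig, aligned_ref):
    """Count SNVs and collect indel run lengths from a pairwise alignment.

    Both inputs must be the same length, with '-' marking gaps.
    An insertion is sequence present in the contig but absent from the reference;
    a deletion is the opposite.
    """
    if len(aligned_contig) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")

    snvs = 0
    insertions, deletions = [], []
    run_kind, run_len = None, 0

    for c, r in zip(aligned_contig.upper(), aligned_ref.upper()):
        if c == "-" and r == "-":
            continue  # gap-only column (can occur in slices of a multiple alignment)
        if r == "-":
            kind = "ins"
        elif c == "-":
            kind = "del"
        else:
            kind = None
            if c != r:
                snvs += 1
        if kind != run_kind:
            # The previous gap run (if any) has ended, so record its length.
            if run_kind == "ins":
                insertions.append(run_len)
            elif run_kind == "del":
                deletions.append(run_len)
            run_kind, run_len = kind, 0
        if kind is not None:
            run_len += 1

    # Record a gap run that extends to the end of the alignment.
    if run_kind == "ins":
        insertions.append(run_len)
    elif run_kind == "del":
        deletions.append(run_len)

    return {"snvs": snvs, "insertions": insertions, "deletions": deletions}


# Invented aligned pair: one 1 bp insertion, one SNV, one 2 bp deletion.
diff = summarize_differences("ACGTTAC--GT", "ACG-TGCAAGT")
print(diff)
```

A handful of SNVs and short indels is what you would expect between related phages; long gap runs at the very start or end of the alignment are the ones worth a closer look when you are asking whether the contig is complete.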