Dear Friends,
I am trying to annotate genes for a phage genome (Salmonella phage vB_SPuM_SP116; accession KP010413.1) that I obtained as a BLAST hit for an assembled contig. Could you please tell me the best way to do it?
I tried DNAmaster, but for some reason it doesn't give me ORFs/CDSs (predicted by Glimmer/GeneMark, which are integrated in DNAmaster) and only gives me tRNAs (predicted by Aragorn, also integrated in DNAmaster); maybe I am doing something wrong. When I run Glimmer locally, however, it does give me ORFs. I would really appreciate your input on what could be wrong here.
I am also trying to download the FASTA file of the phage "Salmonella phage vB_SPuM_SP116" from the https://phagesdb.org/phages/ website, but I cannot find this phage there. Can you please let me know if I am missing something?
Thank you very much! DK
I didn't find Salmonella listed as a host there, either in the header or by going through the rows here: https://phagesdb.org/allphages/
Check the name of the phage. Who submitted the sequence?
I was involved in sequencing some phages; they can be found in the NCBI Nucleotide database.
Hi Natasha,
From NCBI I get this page for the phage: https://www.ncbi.nlm.nih.gov/nuccore/KP010413.1/
Submission information is as below:
But I am not sure about the next step: what information from the page above should I use on the phagesdb website to download this phage? Could you please provide some guidance?
Thank you, DK
Just an edit to my above query:
I went to https://phagesdb.org/data/ and to:
and downloaded the list "With phage names" and looked for the accession of my phage of interest, "KP010413", but could not find it. Is this the right way to find out whether the phage is present on the phagesdb website?
Thanks, DK
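For the lookup described above, a small script can be more reliable than scanning the list by eye, because it also matches the accession when it carries a version suffix (KP010413.1). This is only a sketch: the layout of the downloaded phagesdb list is not shown in this thread, so the tab-separated rows below are made up for illustration, and `find_accession` is a hypothetical helper name.

```python
import re


def find_accession(lines, accession):
    """Return the lines that mention the accession, with or without a version suffix."""
    base = accession.split(".")[0]
    # \b on both sides keeps KP010413 from matching inside a longer identifier.
    pattern = re.compile(r"\b" + re.escape(base) + r"(\.\d+)?\b", re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


# Hypothetical rows, invented for illustration; not real phagesdb data.
example_rows = [
    "PhageA\tXX000001\n",
    "PhageB\tKP010413.1\n",
]
hits = find_accession(example_rows, "KP010413")
print(hits)
```

With a real download you would pass the open file instead of `example_rows`, for example `find_accession(open("downloaded_list.txt"), "KP010413")`, where the file name is whatever you saved the list as. An empty result means the accession is not in that list.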
You could just use pipelines like prokka for this. It is a general-purpose annotation pipeline, and Salmonella is well characterised, so you would likely get good annotations straight away.
Alternatively, you can tell prokka to use different translation tables, and instruct it to use a viral table (though for bacteriophages the bacterial table is probably sufficient anyway).
You found this link below, right?
https://www.ncbi.nlm.nih.gov/nuccore/KP010413.1/
On this page there is a GenBank link in the left-hand corner and a FASTA link below it. Click it; I got the genome here:
https://www.ncbi.nlm.nih.gov/nuccore/KP010413.1?report=fasta
Why do you insist on that phage-database? NCBI has been a reliable source.
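If you prefer to script the download rather than click through the page, the same record can be requested from NCBI's E-utilities `efetch` endpoint. The sketch below uses only the Python standard library; `efetch_fasta_url` and `download_fasta` are hypothetical helper names, and the output file name in the usage note is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession):
    """Build the efetch URL that returns a nucleotide record as plain-text FASTA."""
    params = {"db": "nuccore", "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def download_fasta(accession, out_path):
    """Fetch the FASTA record and write it to out_path (needs network access)."""
    with urlopen(efetch_fasta_url(accession)) as response:
        data = response.read()
    with open(out_path, "wb") as handle:
        handle.write(data)
    return out_path


url = efetch_fasta_url("KP010413.1")
print(url)
```

Calling `download_fasta("KP010413.1", "KP010413.1.fasta")` would save the sequence locally. Switching `rettype` to `gb` should return the GenBank flat file with the authors' annotations instead of the bare sequence.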
Thanks Natasha! I already have the phage genome from NCBI. Since I am using DNAmaster, its tutorial says that one should use only the phage genome from the phagesdb website, as it is a "finished/polished" sequence, which is important for gene prediction and annotation. Since I cannot find the phage in phagesdb, I downloaded it from NCBI and performed the DNAmaster steps. However, after running "auto-annotate" I only see tRNAs, whereas when I run Glimmer locally I do find predicted ORFs. So I am trying to figure out what could be wrong with the DNAmaster step. Any suggestions? I would really appreciate it.
See this link again. https://www.ncbi.nlm.nih.gov/nuccore/KP010413.1/
And go down - the authors predicted a lot of reading frames. What else do you need?
Thanks Natasha! I totally agree! But gene prediction with software like Glimmer, DNAmaster, etc. might predict some new genes; I think that could be one reason researchers use such software. Do you agree? Moreover, DNAmaster is widely used for prediction and annotation, yet the software is not stable and crashes often.
DK: First align your assembled genome to the reference that @Natasha has linked below. Make sure there is good concordance between your sequence and the reference. At that point you could map the reference genome annotation from NCBI onto your assembly (assuming there are good stretches of homology; ideally there would be just a few SNPs).
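Before running a real aligner (blastn, nucmer, minimap2 or similar), a quick alignment-free check can tell you whether the contig and the reference are close enough for annotation transfer to make sense. The sketch below is not an alignment: it swaps in a shared k-mer fraction as a rough proxy for concordance. The function names and the default `k=21` are assumptions chosen for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def canonical_kmers(seq, k=21):
    """Set of strand-independent k-mers; k-mers with N or other symbols are skipped."""
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            # Store the lexicographically smaller of the k-mer and its reverse complement.
            kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def shared_kmer_fraction(assembly, reference, k=21):
    """Fraction of reference k-mers that also occur in the assembly (0.0 to 1.0)."""
    ref = canonical_kmers(reference, k)
    if not ref:
        return 0.0
    return len(ref & canonical_kmers(assembly, k)) / len(ref)


# Toy sequences, invented for illustration.
reference = "ACGTTGCAAGGCTTAACCGGTACG"
assembly = reverse_complement(reference)  # same sequence, opposite strand
print(shared_kmer_fraction(assembly, reference, k=5))
```

A fraction close to 1 suggests the two sequences differ by only a few SNPs; a low value means a proper alignment is needed to see which regions are actually homologous. The value drops quickly with divergence, since a single SNP removes up to k reference k-mers.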
I went to Google and typed ‘bacteriophage genome annotation’.
There are a lot of links there, some of them look promising.
Like this one: https://phagesdb.org/media/workflow/protocols/pdfs/Guiding_Principles_of_Bacteriophage_Genome_Annotation_6.2013_PDF.pdf - your favorite db, isn't it?
or this one: http://grantome.com/grant/NSF/DBI-0850356
Try to find some other database; otherwise you will have to annotate it yourself, as @genomax has suggested...
There should be a lot of them: phages are the most abundant viruses on Earth.
They are harmless, and their genomes are relatively small. I mean, there may be some other databases with the phage genome. Or you can send a letter to the authors; they did it about 4 years ago. Good luck!
Thanks! I will work on these ideas and let you know. I think using DNAmaster for genomes not present on the phagesdb website is not a good idea, as one needs to polish the sequence for analysis. I am thinking of using Glimmer and GeneMark, then annotating the predicted genes using BLAST and/or using the already annotated genes from NCBI, as @genomax said.
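As a sanity check on any gene caller that reports no CDSs at all, a naive six-frame ORF scan shows how many candidate open reading frames the sequence contains. This sketch is not what Glimmer or GeneMark do (both score candidates with statistical models); it only lists start-to-stop stretches. The start codons assume the bacterial translation table 11, and `min_len` and the function names are hypothetical choices.

```python
STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "GTG", "TTG"}  # assumed start codons, as commonly used with table 11


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def orfs_in_strand(seq, min_len):
    """Yield (start, end) 0-based half-open ORF coordinates in the three frames of one strand."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in STARTS:
                start = i  # first start codon after the previous stop
            elif start is not None and codon in STOPS:
                if i + 3 - start >= min_len:
                    yield start, i + 3  # stop codon included in the ORF
                start = None


def find_orfs(seq, min_len=90):
    """Return sorted (start, end, strand) tuples in forward-strand coordinates."""
    seq = seq.upper()
    n = len(seq)
    found = [(s, e, "+") for s, e in orfs_in_strand(seq, min_len)]
    # Scan the reverse complement, then map coordinates back to the forward strand.
    found += [(n - e, n - s, "-") for s, e in orfs_in_strand(reverse_complement(seq), min_len)]
    return sorted(found)


# Toy sequence, invented for illustration: one 18-nt ORF on the forward strand.
toy = "CC" + "ATG" + "AAA" * 4 + "TAA" + "GG"
print(find_orfs(toy, min_len=18))
```

If a scan like this finds many long ORFs in the NCBI sequence while a tool returns only tRNAs, the problem is more likely in how that tool was run or configured than in the sequence itself.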
Look at genomax's comment at the end of this post: A: Shall I take vigna angularis or vigna radiata as a reference for vigna munga
It worked for bacteria; who knows, with a known template it may also help with your phage genome annotation.