6.1 years ago
Paul
I have a new bacterial strain that I assembled and scaffolded with SPAdes. To annotate the scaffolds, I used RAST, which predicted 8000 genes. Of those 8000 genes, RAST assigned an annotation to only 1497.
Could you please suggest a way to annotate all 8000 genes?
How big is your genome? 8000 seems like a high number for a bacterium. Chances are most of those predicted CDSs are junk.
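To put a number on "8000 seems high": a common rule of thumb (an assumption here, not something measured on this strain) is that bacterial genomes carry roughly one gene per kb. A minimal Python sketch of that sanity check follows; the function names and the 5 Mb example size are hypothetical.

```python
# Rule of thumb (assumption): bacterial genomes carry about one gene per kb.
GENES_PER_KB = 1.0


def expected_gene_count(genome_size_bp: int, genes_per_kb: float = GENES_PER_KB) -> int:
    """Rough number of genes expected for a bacterial genome of this size."""
    return round(genome_size_bp / 1000 * genes_per_kb)


def implied_genome_size_bp(gene_count: int, genes_per_kb: float = GENES_PER_KB) -> int:
    """Genome size that a given gene count would imply under the same rule."""
    return round(gene_count / genes_per_kb * 1000)


if __name__ == "__main__":
    # 8000 predicted genes would imply a genome of about 8 Mb.
    print(implied_genome_size_bp(8000))    # 8000000
    # A hypothetical 5 Mb assembly would be expected to carry about 5000 genes.
    print(expected_gene_count(5_000_000))  # 5000
```

If the assembly is much smaller than about 8 Mb, the gene count is probably inflated by fragmented contigs, contamination, or spurious short CDSs.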
Try annotation with Prokka and see if you get comparable results.
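For reference, a basic Prokka run could be launched from Python roughly like this. It is only a sketch: it assumes Prokka is installed and on your PATH, and the file name `scaffolds.fasta`, the output directory, and the prefix are placeholders to replace with your own.

```python
import shlex
import subprocess


def prokka_command(contigs, outdir="prokka_out", prefix="my_strain", cpus=4):
    """Build a basic Prokka command line for a bacterial assembly.

    The default outdir and prefix are placeholder names, not required values.
    """
    return [
        "prokka",
        "--outdir", outdir,
        "--prefix", prefix,
        "--kingdom", "Bacteria",
        "--cpus", str(cpus),
        contigs,
    ]


if __name__ == "__main__":
    cmd = prokka_command("scaffolds.fasta")
    # Show the command; uncomment the next line to actually run it.
    print(shlex.join(cmd))
    # subprocess.run(cmd, check=True)
```

You can then compare the number of CDSs Prokka predicts with the 8000 reported by RAST.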
How many contigs do you have? Did you check the overall assembly quality (e.g. with QUAST) and/or the degree of potential contamination (you should always check ALL your assemblies with CheckM)?
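If you just want the contig count, total length, and N50 before running QUAST, a few lines of plain Python will do. This is a rough sketch that assumes an ordinary uncompressed FASTA file; it does not replace QUAST and tells you nothing about contamination, so CheckM is still needed for that.

```python
import sys


def read_contig_lengths(fasta_path):
    """Return the length of every sequence in a FASTA file, in file order."""
    lengths = []
    current = None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    contig_lengths = read_contig_lengths(sys.argv[1])
    print("contigs:", len(contig_lengths))
    print("total length (bp):", sum(contig_lengths))
    print("N50 (bp):", n50(contig_lengths))
```

A very large number of short contigs, or a total length far above what is expected for the species, would both point to an assembly problem rather than an annotation problem.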
I agree with jrj.healy that you should try Prokka. It is by far the best automatic annotation program I have seen so far.
Leaving potential quality considerations aside (8000 genes DOES sound like a lot, though), keep in mind that you can't expect a 100% annotation rate even in the best case. You will frequently end up with more than 50% unannotated "hypothetical" genes. And the more distantly related your new isolate is to any known reference genome (i.e. the "newer" the organism appears to be), the fewer genes can be successfully annotated based on references.
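To make the annotation rate concrete, here is a small sketch that computes it from a list of product names. It assumes unannotated genes carry the product name "hypothetical protein", which is the usual label but worth checking in your own output; the function name is made up for illustration.

```python
def annotated_fraction(products):
    """Fraction of genes whose product is not labelled 'hypothetical protein'."""
    if not products:
        return 0.0
    annotated = sum(1 for p in products if "hypothetical protein" not in p.lower())
    return annotated / len(products)


if __name__ == "__main__":
    # Numbers from the question: 1497 annotated genes out of 8000 predicted.
    print(f"{1497 / 8000:.1%}")  # 18.7%
```

An annotation rate of about 19% is well below the roughly 50% mentioned above, which again suggests that many of the 8000 predictions may not be real genes.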