6 months ago
Vijith
I would like some help with the Augustus gene prediction tool. I am using Augustus to annotate a de novo assembled genome of a plant species.
The Augustus documentation lists several plant species, but these model plants and the one I am working with belong to different orders, so I am unsure which species to choose.
In addition, data from a transcriptome analysis performed on a closely related genus is available here. I would like to know how I can use this data to get a more accurate gene prediction. Any input is highly appreciated.
Using any of the plant species in the list would likely give a reasonable annotation set. You could use Arabidopsis, since it probably has the most up-to-date and widely used annotation of those listed, but another approach would be to pick whichever available species is most closely related to yours.
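As a minimal sketch of what that run could look like, the helper below only assembles the Augustus command line; it does not execute anything. The file name `genome.fa` and the function name `augustus_command` are hypothetical, and the species identifier should be checked against the output of `augustus --species=help` for your installation.

```python
import shlex


def augustus_command(genome_fasta, species="arabidopsis", gff3=True):
    """Build the argument list for a basic AUGUSTUS run (nothing is executed)."""
    cmd = ["augustus", f"--species={species}"]
    if gff3:
        # Ask AUGUSTUS for GFF3 output instead of its default GTF-like format.
        cmd.append("--gff3=on")
    # The genome FASTA is passed as the final positional argument.
    cmd.append(genome_fasta)
    return cmd


# Hypothetical input file; redirect the printed command's stdout to a file when running it.
cmd = augustus_command("genome.fa")
print(shlex.join(cmd))
```

Swapping `species` for a closer relative, if one is available, only changes the one argument, so it is cheap to run both and compare the resulting gene sets.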
Hi dthorbur, thanks for the response. I have run Augustus with Arabidopsis as the species. I would now like to know whether BRAKER2 with a protein database would be suitable. I have found transcriptome data for three closely related genera, and I can derive protein sequences from these transcriptomes. According to the BRAKER2 documentation, annotation can be performed using the genome and the protein data in FASTA format.
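A rough sketch of that setup, assuming the proteins from each genus are already loaded as `{id: sequence}` dictionaries: the first function merges them into one FASTA with genus-prefixed IDs so that identifiers cannot collide, and the second assembles a BRAKER call with its `--genome` and `--prot_seq` options. The labels, IDs, file names, and function names are all hypothetical, and any further options your BRAKER version needs should be taken from its own documentation.

```python
import shlex


def merge_protein_sets(named_sets):
    """Combine {label: {seq_id: sequence}} into FASTA lines with label-prefixed IDs."""
    lines = []
    for label, records in named_sets.items():
        for seq_id, seq in records.items():
            # Prefix each ID with its source label so IDs stay unique after merging.
            lines.append(f">{label}_{seq_id}")
            lines.append(seq)
    return lines


def braker_protein_command(genome_fasta, protein_fasta):
    """Build the argument list for a protein-evidence BRAKER run (nothing is executed)."""
    return ["braker.pl", f"--genome={genome_fasta}", f"--prot_seq={protein_fasta}"]


# Hypothetical toy data standing in for the three related genera.
merged = merge_protein_sets({
    "genusA": {"t1": "MKTAYIAK"},
    "genusB": {"t1": "MSLLTEVE"},
    "genusC": {"t1": "MADEEKLP"},
})
print("\n".join(merged))
print(shlex.join(braker_protein_command("genome.fa", "proteins.fa")))
```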
I don't have any experience using BRAKER2, so I cannot comment, but I would be extra careful about data you find on non-model organisms. There is a lot of mediocre data out there, and using poorly annotated transcriptomes can be quite detrimental. I would double-check things like estimated completeness, BUSCO scores, etc.
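Before running a full BUSCO assessment, a quick first-pass look at a transcriptome-derived protein set can already flag obvious problems. The sketch below is not a substitute for BUSCO; it only counts sequences, how many start with methionine, and how many contain an internal stop codon (`*`), assuming stops are written as `*`. The function names and the toy records are hypothetical.

```python
def read_fasta(lines):
    """Parse an iterable of FASTA text lines into {first_word_of_header: sequence}."""
    records, header, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def protein_summary(records):
    """Count sequences, M-initial sequences, and sequences with an internal '*'."""
    seqs = [s.upper() for s in records.values()]
    return {
        "sequences": len(seqs),
        "start_with_M": sum(1 for s in seqs if s.startswith("M")),
        # Trailing '*' is a normal terminal stop; any '*' left after stripping is internal.
        "internal_stop": sum(1 for s in seqs if "*" in s.rstrip("*")),
    }


# Toy example: 'b' lacks a start methionine and has an internal stop.
example = [">a some description", "MKT", "LL*", ">b", "KT*AA", ">c", "M"]
summary = protein_summary(read_fasta(example))
print(summary)
```

A large share of proteins without a start methionine or with internal stops suggests fragmented or mistranslated transcripts, which is a reason to be cautious about using that set as evidence.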