Hi @ll!
I am working with a recently sequenced genome of a non-model system, which is this one:
https://www.ncbi.nlm.nih.gov/assembly/GCA_004329575.1/
The assembly that I can download from there includes only a *.GFF file. A *.GTF file is available too, but it is largely empty (the GFF is not). My bioinformatics pipeline requires a GTF file as input. It also says, "Note that the GTF file should resemble the Ensembl format."
Hence I would like to ask for advice on how best to convert this specific GFF file to GTF. I have read that the conversion differs from case to case and that there is no general method that works on all GFF files.
So I would be very happy if someone experienced here could take a quick look at this specific GFF file and advise me on how best to convert it into a usable *.GTF file.
Thank you a lot for your time!
Cheers
Joe
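Before attempting any conversion, it may help to check which feature types the GFF actually contains, since a GTF needs gene, transcript and exon records (each carrying gene_id and transcript_id attributes). The following is only a minimal sketch in Python: the function name count_feature_types and the two example lines are made up for illustration and are not taken from the real file.

```python
import collections
import io


def count_feature_types(handle):
    """Count the values in column 3 (feature type) of a GFF/GTF stream."""
    counts = collections.Counter()
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines such as "##gff-version 3".
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A valid GFF/GTF record has nine tab-separated columns.
        if len(fields) < 9:
            continue
        counts[fields[2]] += 1
    return counts


# Hypothetical two-line example standing in for a real GFF file.
example = io.StringIO(
    "##gff-version 3\n"
    "scaffold1\tGenbank\tregion\t1\t5000\t.\t+\t.\tID=scaffold1:1..5000\n"
    "scaffold2\tGenbank\tregion\t1\t3000\t.\t+\t.\tID=scaffold2:1..3000\n"
)
feature_counts = count_feature_types(example)
print(feature_counts)
```

To run it on a downloaded file, pass an open file handle instead of the example, for instance open("genomic.gff") (the file name is an assumption). If the counts show only types such as region and no gene, mRNA, exon or CDS, the file holds no gene models, and no converter can produce a usable GTF from it.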
That's a bummer. Unfortunately I do not have the funds to create a real annotation, so I cannot run the pipeline I wanted to use (it requires a GTF file). Is there maybe a way to "simulate" an annotation without actual work on the DNA, such as described here: https://galaxyproject.github.io/training-material/topics/genome-annotation/tutorials/genome-annotation/tutorial.html ?
Run Augustus with the HMM model of the species closest to the one you want to investigate. It is a quick and decent approach. You must run RepeatMasker on your genome first (here too you can use the repeat library of the closest species available).
There are plenty of automated and semi-automated ways to do annotation. You could use funannotate, which is available as a container.
Here is a list of annotation tools.
My main issue is that there is no HMM model for any species remotely close to this snail. Gastropods/molluscs are not vertebrates or mammals, not worms, not insects, not plants and not fungi.