The GFF3 format is well specified for describing sequence location and type in the first eight columns. The ninth column is left for specifying any remaining information. I would like to use GFF3 to encode the data typically produced by a genome annotator.
How should I go about this? Are there any conventions for encoding information such as protein product, EC number, and description?
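One possible approach, offered as a rough sketch rather than an established convention: column nine is a semicolon-separated list of tag=value pairs, with multiple values for one tag separated by commas. As I understand the spec, tags beginning with an uppercase letter (ID, Name, Parent, Note, Dbxref and so on) are reserved, while tags beginning with a lowercase letter are free for applications to use. The tag names `product`, `ec_number` and `description` below are my own illustrative choices, and `format_attributes` is a hypothetical helper, not part of any library.

```python
from urllib.parse import quote

# Characters left unescaped in values; ';', '=', '&', ',', '%', tabs and
# newlines are NOT in this set, so quote() percent-encodes them.
SAFE = " /:()+|"


def format_attributes(attrs):
    """Build a GFF3 column-9 string from a dict of tag -> value (or list of values)."""
    parts = []
    for tag, value in attrs.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        # Multiple values for one tag are joined with commas.
        encoded = ",".join(quote(str(v), safe=SAFE) for v in values)
        parts.append(f"{quote(tag, safe='')}={encoded}")
    return ";".join(parts)


# Illustrative record: the lowercase tags are assumptions, not a standard.
attributes = format_attributes({
    "ID": "gene0001",
    "product": "ATP synthase; alpha",
    "ec_number": ["3.6.3.14", "1.1.1.1"],
})

# The first eight columns here are made-up example values.
columns = ["contig1", "annotator", "CDS", "100", "900", ".", "+", "0", attributes]
line = "\t".join(columns)
print(line)
```

Percent-encoding the values keeps a free-text product name or description from being split on a stray semicolon or comma when the file is parsed.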
Thanks for the suggestion. Are there any common variants of GFF which include this type of information?
You mean that have protein information? Not that I can recall, though I can tell you that the GFF3 at NCBI has EC_number tags that violate the GFF3 spec (at least, they did the last time I checked). What is the end use that you have in mind?
Encoding additional genome annotation data in GFF3, such as product and description.
Yes, but I meant, why do you want to do that? Who is it for, and what will they be using the GFF file for?