GFF3 seems to be the standard for representing genome annotations. It has a clearly defined specification and is produced by some genome annotation software. The GenBank submission process, however, requires its own data format for annotation submission. Is there a specific reason for that?
To get a definitive answer you'd have to ask NCBI, but I suspect that the reasons are a) history and b) cost. Their format has been used for a long time, predating GFF3 (and GFF2) by many years. From GenBank's perspective there's no compelling reason to change, or to support two formats when they have one that works. Creating new submission tools to support the potentially complex mapping(s) of GFF3 to GenBank and then troubleshooting the submissions would require extra developer and support staff time.
Historical legacy is a large part of why NCBI looks rather old-fashioned to our younger bioinformaticians. Providing things that we now expect, such as REST APIs, would require complete re-engineering of their entire, extensive infrastructure.
Well, to be fair, GFF3 is kinda screwy: virtually everything of interest gets shoved into the ninth column. I'm not sure it's the heir apparent to Sequin/GenBank.
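To make the ninth-column point concrete, here is a minimal sketch of pulling one GFF3 feature line apart. It assumes the usual GFF3 layout (nine tab-separated columns, with the last one holding `tag=value` pairs separated by semicolons, comma-separated multiple values, and URL-style escaping). The function name `parse_gff3_line` and the example record are made up for illustration, not taken from any real annotation or library.

```python
from urllib.parse import unquote


def parse_gff3_line(line):
    """Split one GFF3 feature line into its nine tab-separated columns.

    Hypothetical helper for illustration only; not a full GFF3 parser
    (no handling of comments, directives, or FASTA sections).
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols

    # Column 9: semicolon-separated tag=value pairs; a tag may carry
    # several comma-separated values, so every value is kept as a list.
    attributes = {}
    if attrs != ".":
        for pair in attrs.split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = [unquote(v) for v in value.split(",")]

    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


# Made-up example record, not from a real genome.
example = (
    "chr1\texample\tgene\t1000\t2000\t.\t+\t.\t"
    "ID=gene0001;Name=abcA;Dbxref=GeneID:1,GO:0005524"
)
feature = parse_gff3_line(example)
print(feature["attributes"])
```

The first eight columns are fixed and simple; identifiers, names, parent/child links and cross-references all live in that one free-form attributes field, which is where most of the work in mapping to another format would likely end up.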
Thank you. I wondered if there was a specific reason that this format is used. Too expensive to change is a good reason.