I am editing a quick-and-dirty parser that turns an .sqn file into a GFF3 file. My main change was adding support for introns and exons. I then noticed that if a tRNA coding for the same amino acid with the same anticodon appears twice, both copies get the same name and ID. If they are on different strands, the validator I was using throws an error. This made me suspicious that I was doing things wrong.
Should they all be called: ID=trnI(gau);name=trnI(gau)
or should I have ID=trnI(gau)01;name=trnI(gau)01, ID=trnI(gau)02;name=trnI(gau)02, ...
or something else?
I know that using the same ID doesn't meet my needs right now, because when I import the GFF3 file into Geneious it combines the copies into one annotation. However, I will eventually be submitting to GenBank, so I don't want weirdly formatted tRNA annotation names.
That is helpful, but I am still left wondering how to implement this. I guess the names could be the same, but the IDs have to be unique. I was just wondering if there is a standard way of labeling the IDs in a GFF3 file so that they are all unique.
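One way to get there, as a minimal sketch rather than any standard: keep the name as-is and append a per-name counter to the ID only. The function names `assign_unique_ids` and `format_attributes` are hypothetical, the two-digit suffix simply mirrors the `trnI(gau)01` idea from the question, and the attribute tag is written as capitalized `Name` because that is the spelling GFF3 reserves.

```python
from collections import Counter


def assign_unique_ids(names):
    """Return (ID, Name) pairs where each ID is the name plus a per-name counter.

    Hypothetical helper: the two-digit suffix is an illustration, not a convention.
    """
    seen = Counter()
    pairs = []
    for name in names:
        seen[name] += 1
        pairs.append((f"{name}{seen[name]:02d}", name))
    return pairs


def format_attributes(feature_id, name):
    """Build the GFF3 column-9 string for one feature (ID unique, Name shared)."""
    return f"ID={feature_id};Name={name}"


names = ["trnI(gau)", "trnA(ugc)", "trnI(gau)"]
attribute_strings = [format_attributes(i, n) for i, n in assign_unique_ids(names)]
```

With this scheme the two `trnI(gau)` copies keep an identical name but receive distinct IDs, so tools that key on the ID should no longer merge them.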
I have never heard of a specific standard for that purpose. I think you can follow your instinct about what works best, or take inspiration from how others do it (Ensembl?).
What I usually do is just give each one an ID like "tRNA-1", with a counter that starts at 1 and is incremented for every tRNA. I don't take the type of tRNA into account, but you could.
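A minimal sketch of that running counter in Python; the function name `number_trnas` and its `prefix` and `start` parameters are hypothetical and chosen only for illustration:

```python
from itertools import count


def number_trnas(n_features, prefix="tRNA-", start=1):
    """Return sequential IDs such as tRNA-1, tRNA-2, ... regardless of tRNA type."""
    counter = count(start)
    return [f"{prefix}{next(counter)}" for _ in range(n_features)]


ids = number_trnas(3)
```

Because the counter ignores the tRNA type, every ID is unique by construction; the amino acid and anticodon can still be carried in the name.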