Do parent feature lines always appear before sub-feature lines in GFF3 files?
7.6 years ago
bill-zt ▴ 50

In a GFF3 file, there are parent feature lines and sub-feature lines, like:

I   SGD gene    335 649 .   +   .   ID=gene:YAL069W;biotype=protein_coding; 
I   SGD mRNA    335 649 .   +   .   ID=transcript:YAL069W;Parent=gene:YAL069W;biotype=protein_coding

Usually, parent feature lines (such as gene) appear before sub-feature lines (such as mRNA). I'm not sure whether this is a convention or a strict rule. The GFF3 Specifications document (https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md) does not state any such rule.
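
To see whether a particular file follows the parent-first ordering, one option is to scan it and report every `Parent` value that is referenced before a line with that `ID` has been seen. The following is only a minimal sketch: the function name `find_orphaned_children` is made up for illustration, it assumes tab-separated columns with `key=value` attributes in column 9, and it will also flag parents that never appear in the file at all.

```python
def find_orphaned_children(lines):
    """Return (line_number, parent_id) pairs for Parent references
    that occur before any line carrying that ID."""
    seen_ids = set()
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines, comments and directives.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature line (e.g. sequence data)
        # Parse the attribute column into a dict.
        attrs = {}
        for field in cols[8].split(";"):
            if "=" in field:
                key, value = field.split("=", 1)
                attrs[key.strip()] = value.strip()
        # A feature may list several parents, separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent and parent not in seen_ids:
                problems.append((lineno, parent))
        if "ID" in attrs:
            seen_ids.add(attrs["ID"])
    return problems
```

An empty result means every child line came after its parent; for a file on disk, something like `find_orphaned_children(open("annotation.gff3"))` would do, where the file name is a placeholder.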

GFF3 genome annotation • 1.4k views
7.6 years ago
Michael 55k

To my knowledge, the GFF format definition itself does not require this anywhere. However, if you want to import GFF3 features into a database efficiently, e.g. one using the CHADO schema, the features need to be sorted this way. The CHADO package includes a Perl script that sorts and groups gene models by position and parent relationship: gmod_gff3_preprocessor.pl
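
If you only need the parent-before-child ordering and not the full preprocessing, a small script can reorder the lines. The sketch below is not what gmod_gff3_preprocessor.pl does internally (it does no sorting by position, and I have not tried to mirror that script); it simply walks the file in its existing order and pulls each feature's parents in front of it. The function name `parents_first` is hypothetical, and it assumes tab-separated columns with `ID` and `Parent` attributes in column 9.

```python
def parents_first(lines):
    """Return the lines reordered so that every feature's parents
    come before it; all other relative order is preserved."""
    records = []   # (line, list of parent IDs), in input order
    by_id = {}     # ID -> indexes of the lines that carry that ID
    for line in lines:
        line = line.rstrip("\n")
        cols = line.split("\t")
        attrs = {}
        # Only 9-column, non-comment lines are treated as features.
        if not line.startswith("#") and len(cols) == 9:
            for field in cols[8].split(";"):
                if "=" in field:
                    key, value = field.split("=", 1)
                    attrs[key.strip()] = value.strip()
        parents = [p for p in attrs.get("Parent", "").split(",") if p]
        index = len(records)
        records.append((line, parents))
        if "ID" in attrs:
            by_id.setdefault(attrs["ID"], []).append(index)

    ordered = []
    emitted = set()

    def emit(index):
        if index in emitted:
            return
        # Mark before recursing so a cyclic Parent reference cannot loop forever.
        emitted.add(index)
        for parent in records[index][1]:
            for parent_index in by_id.get(parent, []):
                emit(parent_index)
        ordered.append(records[index][0])

    for index in range(len(records)):
        emit(index)
    return ordered
```

Comment and directive lines have no parents, so they stay where they are relative to the lines that were not moved. For a Chado import I would still use the preprocessor script, since it also groups the gene models.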
