GFF3 header delimiter: a space or a tab?
5.1 years ago
microfuge ★ 1.9k

Dear All,

I could not find a source that states which field delimiter should be used in GFF3 header lines. Can it be either a space or a tab, or must it be a space? My hunch is a space.

##gff-version 3 
##sequence-region 1 10

Many Thanks!

gff3 • 2.7k views

I don't think it even matters.

You could open it in vi and then run :set list to show all 'invisible' characters (^I is a tab).
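If you would rather script the check, here is a minimal Python sketch that reports which whitespace character follows the directive name on each `##` line. The function name `directive_delimiters` and the sample text are made up for illustration; they are not part of any GFF3 tool.

```python
def directive_delimiters(text):
    """Map each '##' directive name to the whitespace character that follows it."""
    found = {}
    for line in text.splitlines():
        if not line.startswith("##"):
            continue  # only header directives are of interest here
        body = line[2:]
        # Look for the first space or tab after the directive name, if any.
        for i, ch in enumerate(body):
            if ch in (" ", "\t"):
                found[body[:i]] = "tab" if ch == "\t" else "space"
                break
    return found


# Hypothetical two-line header: one directive uses a space, the other a tab.
sample = "##gff-version 3\n##sequence-region\t1 10\n"
print(directive_delimiters(sample))
```

To check a real file, read its contents and pass the text to the function instead of `sample`.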


Thanks so much! This was a fake GFF entry I created; I just wanted to know whether the official specification says anything about it. I did not know about the :set list option in vi (quite nice :) ).


Link to "official" (best I've found so far) specifications: https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md

5.1 years ago
Carambakaracho ★ 3.3k

This is a great question, and one reason why I 'love' the GFF format so much. It is not explicitly defined. Period. See the definition of directives in the GFF3 format: implicitly, the documentation uses spaces, just as ATpoint illustrated, so I recommend spaces, too. Tabs are used to separate the columns of the feature lines.
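Since the directive delimiter is not pinned down, a reader can simply be tolerant about it. The following is a minimal sketch, not taken from any existing GFF3 library: directives are split on any run of whitespace (so a space or a tab both work), while feature lines are split strictly on tabs. The function name `parse_gff3_line` and the returned labels are assumptions chosen for illustration.

```python
def parse_gff3_line(line):
    """Classify one GFF3 line and split it into fields."""
    line = line.rstrip("\n")
    if not line.strip():
        return ("blank", [])
    if line.startswith("##"):
        # Directive: split on any whitespace, so space- and tab-delimited
        # headers are handled the same way.
        return ("directive", line[2:].split())
    if line.startswith("#"):
        # Single-'#' lines are free-text comments; keep them unsplit.
        return ("comment", [line])
    # Feature line: columns are tab-separated.
    return ("feature", line.split("\t"))


print(parse_gff3_line("##sequence-region 1 10"))
print(parse_gff3_line("##sequence-region\t1\t10"))
```

Both calls yield the same fields, which is the point of splitting on arbitrary whitespace. When writing files, sticking to a single space in directives matches what the specification's examples show.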


+1 for the space.
As you can see in the snapshots of the different versions of the format that I included in my review of the format here: https://github.com/NBISweden/GAAS/blob/master/annotation/CheatSheet/gxf.md they have always used a space.
Let's ask them to clarify it in the repo of the gff3 specification. ✅ => https://github.com/The-Sequence-Ontology/Specifications/issues/23

5.1 years ago
shoujun.gu ▴ 350

GFF3 from GENCODE is tab-delimited.

Edit: sorry, I didn't notice the post is about the header... Then it is just regular sentences, I think.


No, it isn't; it is a space, at least in the mouse (v20) files I have on my machine.

gzcat gencode.vM20.annotation.gff3.gz | head
##gff-version 3
#description: evidence-based annotation of the mouse genome (GRCm38), version M20 (Ensembl 95)
#provider: GENCODE
#contact: gencode-help@ebi.ac.uk
#format: gff3
#date: 2018-11-30
##sequence-region chr1 1 195471971

Yes, I just realized the post deals with the header only.


I guess the header lines are simply more or less non-standardized, but for the actual feature lines, yes, it is a tab, as in most bioinformatics formats.
