Conversion of GFF3 to GTF
3
25
12.6 years ago

How do I convert a GFF file to a GTF file? Is there any tool available?

gtf gff • 120k views
1

It would help to know what your downstream analysis or usage is. If GTF is only an intermediate step towards another conversion, I suggest you try to obtain the final format directly. From SEQanswers:

The whole point of the GTF format was to standardise certain aspects that are left open in GFF. Hence, there are many different valid ways to encode the same information in a valid GFF format, and any parser or converter needs to be written specifically for the choices the author of the GFF file made. For example, a GTF file requires the gene ID attribute to be called "gene_id", while in GFF files, it may be "ID", "Gene", something different, or completely missing. Hence, a general GFF-to-GTF converter (as opposed to one converting only GFF files from a very specific source) needs to guess this from the data, which is non-trivial.

In general, it is difficult to get this right unless you are working on one particular GFF file, because GFF is more general than GTF.
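
To make the point about guessing concrete, here is a minimal Python sketch of rewriting a GFF3 attribute column in GTF style. The function names, the example attribute string, and above all the list of candidate keys are assumptions made for illustration; a real converter has to be adapted to the conventions of the specific GFF source.

```python
def parse_gff3_attributes(column9):
    """Parse a GFF3 attribute column ("key=value;key=value") into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def guess_gene_id(attrs, candidate_keys=("gene_id", "gene", "Gene", "Parent", "ID")):
    """Return the value of the first candidate key present, or None.

    The candidate list is a guess (hypothetical); adapt it to each GFF source.
    """
    for key in candidate_keys:
        if key in attrs:
            return attrs[key]
    return None


def format_gtf_attributes(gene_id, transcript_id, extra=None):
    """Format attributes GTF style (key "value";), mandatory keys first."""
    pairs = [("gene_id", gene_id), ("transcript_id", transcript_id)]
    pairs.extend((extra or {}).items())
    return " ".join(f'{key} "{value}";' for key, value in pairs)


# Hypothetical attribute column of an mRNA feature.
gff3_col9 = "ID=mRNA0001;Parent=gene0001;Name=demo"
attrs = parse_gff3_attributes(gff3_col9)
gtf_col9 = format_gtf_attributes(guess_gene_id(attrs), attrs["ID"])
print(gtf_col9)
```

Note that using "Parent" as the gene ID only happens to be right for the mRNA line in this example. For an exon line, "Parent" would normally point to the transcript instead, which is exactly the kind of source-specific detail that makes a general converter hard to write.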

0

In general, GTF is a subset of GFF that is often used for counting peaks in RNA-seq data. It would be very useful if you gave more information on what you are trying to do. You can find the specifications here: GFF3: http://www.sequenceontology.org/gff3.shtml, GTF: http://mblab.wustl.edu/GTF22.html, and GFF: http://www.sanger.ac.uk/resources/software/gff/spec.html

0

Hi,

I'm still looking for a tool that can convert GenBank data to GTF for species that are not in the Ensembl database. Any suggestions?

64
11.0 years ago
gleparc ▴ 640

The easiest way is to use the gffread program that comes with the Cufflinks software suite (Tuxedo):

gffread my.gff3 -T -o my.gtf

See gffread -h for more information.

0

This should be an answer! It worked for me, thanks.

0

gffread from Cufflinks version 2.2.1 does not work properly: it keeps only "gene_id" and "transcript_id" in the 9th column. For example, the exon number was stripped. I haven't tried the latest version, because I couldn't find the binaries and don't want to install the additional packages needed for compilation.
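
One way to check which attributes survived a conversion is to tally the keys in the 9th column of the output. The following is a small Python sketch, not part of gffread; the function name, the regular expression, and the example lines are assumptions made for illustration.

```python
from collections import Counter
import re

# Matches GTF attributes of the form: key "value";  (unquoted values also accepted)
ATTRIBUTE_PATTERN = re.compile(r'(\S+)\s+"?([^";]*)"?\s*;')


def count_gtf_attribute_keys(lines):
    """Count how often each attribute key appears in column 9 of GTF lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a complete feature line
        for key, _value in ATTRIBUTE_PATTERN.findall(columns[8]):
            counts[key] += 1
    return counts


# Hypothetical GTF lines; in practice pass an open file handle instead.
example = [
    "##gff-version 2\n",
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1"; exon_number "1";\n',
    'chr1\tdemo\texon\t300\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
]
key_counts = count_gtf_attribute_keys(example)
print(dict(key_counts))
```

Running this on the GTF before and after conversion (for example with `count_gtf_attribute_keys(open("my.gtf"))`) shows at a glance whether keys such as exon_number were dropped.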

0

Worked for me. Thank you so much.

11
12.6 years ago
Paolo ▴ 320

Take a look at the rtracklayer Bioconductor package:

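library(rtracklayer)  # provides import() and export()
?import.gff3  # help page for the GFF import functions
# Locate the example GFF3 file shipped with rtracklayer
test_path <- system.file("tests", package = "rtracklayer")
test_gff3 <- file.path(test_path, "genes.gff3")
test <- import(test_gff3)  # read the GFF3 file into a GRanges object
export(test, "test.gtf", "gtf")  # write it back out in GTF format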

0

It helped. Thanks :)

0

I'm not sure this really produces GTF: after running the commands above, I have "gff-version 2" in the header of the exported file.

4

The GTF (General Transfer Format) is identical to GFF version 2. https://uswest.ensembl.org/info/website/upload/gff.html

3
4.8 years ago
Juke34 9.0k

I made a mini review of existing tools. See here.

0

Thanks, AGAT works well.

0

@Juke34 it seems that AGAT doesn't work with gzip files. Why? Paulo

1

Yes, it does not... yet. There is no special reason for that; it can be implemented.
Edit: It is now implemented (v0.8.1)
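
For tools or scripts that only read plain text, gzip-compressed annotation files can be handled by decompressing on the fly. Below is a minimal Python sketch of that idea; it is not how AGAT implements it, and the function name and the temporary demo files are assumptions made for illustration.

```python
import gzip
import os
import tempfile


def open_annotation(path):
    """Open a plain or gzip-compressed annotation file in text mode.

    Compression is detected from the gzip magic bytes, not the file name.
    """
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    if is_gzip:
        return gzip.open(path, "rt")
    return open(path, "r")


# Demo on two temporary files with identical content, one of them gzipped.
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "demo.gff3")
    gz_path = plain_path + ".gz"
    with open(plain_path, "w") as out:
        out.write("##gff-version 3\n")
    with gzip.open(gz_path, "wt") as out:
        out.write("##gff-version 3\n")
    with open_annotation(plain_path) as fh:
        plain_first = fh.readline()
    with open_annotation(gz_path) as fh:
        gz_first = fh.readline()

print(plain_first == gz_first)
```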
