I am trying to upload an annotation file to the Galaxy RNA-seq workflow, but my annotation is in .gff format. What is the easiest way to convert a .gff file to a .gtf file?
A few other programs that do the conversion:
genome tools
http://genometools.org/tools/gt_gff3_to_gtf.html
ea-utils
https://github.com/ExpressionAnalysis/ea-utils/blob/master/clipper/gff2gtf
pasa
https://github.com/PASApipeline/PASApipeline/blob/master/misc_utilities/gff3_to_gtf_format.pl
kent utils:
http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/
gff3ToGenePred followed by genePredToGtf
GFFtools-GX
https://github.com/vipints/GFFtools-GX/blob/master/gff_to_gtf.py
Great list from @Jeffin. I would add gffread and agat_convert_sp_gff2gtf.pl from AGAT.
I tested the different solutions with this GFF3 test file, and the results differ depending on the method used:
##gff-version 3
scaffold625 maker gene 337818 343277 . + . ID=CLUHARG00000005458;Name=TUBB3_2
scaffold625 maker mRNA 337818 343277 . + . ID=CLUHART00000008717;Parent=CLUHARG00000005458
scaffold625 maker tss 337915 337918 . + . ID=CLUHART00000008717:tss;Parent=CLUHART00000008717
scaffold625 maker CDS 337915 337971 . + 0 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625 maker CDS 340733 340841 . + 0 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625 maker CDS 341518 341628 . + 2 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625 maker CDS 341964 343033 . + 2 ID=CLUHART00000008717:cds;Parent=CLUHART00000008717
scaffold625 maker exon 337818 337971 . + . ID=CLUHART00000008717:exon1;Parent=CLUHART00000008717
scaffold625 maker exon 340733 340841 . + . ID=CLUHART00000008717:exon2;Parent=CLUHART00000008717
scaffold625 maker exon 341518 341628 . + . ID=CLUHART00000008717:exon3;Parent=CLUHART00000008717
scaffold625 maker exon 341964 343277 . + . ID=CLUHART00000008717:exon4;Parent=CLUHART00000008717
scaffold625 maker five_prime_utr 337818 337914 . + . ID=CLUHART00000008717:five_prime_utr;Parent=CLUHART00000008717
scaffold625 maker three_prime_UTR 343034 343277 . + . ID=CLUHART00000008717:three_prime_utr;Parent=CLUHART00000008717
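The hierarchy in the test file above (a gene, an mRNA, then exon/CDS rows linked through Parent) is what every converter has to flatten into gene_id and transcript_id attributes. Here is a minimal Python sketch of that step. It is not the code of any tool listed in this thread, and the function names `parse_gff3_attributes` and `gff3_to_gtf_lines` are made up for illustration. It assumes each transcript has a single Parent gene, each exon/CDS has a single Parent transcript, and it keeps only exon and CDS rows.

```python
def parse_gff3_attributes(column9):
    """Turn 'ID=x;Parent=y' into {'ID': 'x', 'Parent': 'y'}."""
    pairs = (item.split("=", 1) for item in column9.strip().split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def gff3_to_gtf_lines(gff3_lines, transcript_types=("mRNA", "transcript"),
                      kept_types=("exon", "CDS")):
    """Return GTF lines for exon/CDS rows (illustrative sketch only)."""
    rows = []
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # Simplification: split on any whitespace. Real GFF3 is tab-separated,
        # and a field containing spaces would need split("\t") instead.
        fields = line.rstrip("\n").split(None, 8)
        if len(fields) == 9:
            rows.append((fields[:8], parse_gff3_attributes(fields[8])))

    # First pass: remember which gene each transcript belongs to.
    gene_of = {attrs["ID"]: attrs.get("Parent", attrs["ID"])
               for cols, attrs in rows
               if cols[2] in transcript_types and "ID" in attrs}

    # Second pass: emit exon/CDS rows with GTF-style attributes.
    gtf_lines = []
    for cols, attrs in rows:
        transcript = attrs.get("Parent")
        if cols[2] in kept_types and transcript in gene_of:
            attributes = f'gene_id "{gene_of[transcript]}"; transcript_id "{transcript}";'
            gtf_lines.append("\t".join(cols) + "\t" + attributes)
    return gtf_lines


example = [
    "##gff-version 3",
    "scaffold625\tmaker\tgene\t337818\t343277\t.\t+\t.\tID=CLUHARG00000005458;Name=TUBB3_2",
    "scaffold625\tmaker\tmRNA\t337818\t343277\t.\t+\t.\tID=CLUHART00000008717;Parent=CLUHARG00000005458",
    "scaffold625\tmaker\texon\t337818\t337971\t.\t+\t.\tID=CLUHART00000008717:exon1;Parent=CLUHART00000008717",
]
for gtf_line in gff3_to_gtf_lines(example):
    print(gtf_line)
```

A sketch like this drops the gene, transcript and UTR rows and every attribute except the two IDs, which is exactly the kind of information loss to watch for in the real outputs.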
AGAT (agat_convert_sp_gff2gtf.pl)
##gtf-version 3
scaffold625 maker gene 337818 343277 . + . ID CLUHARG00000005458; Name TUBB3_2; gene_id CLUHARG00000005458
scaffold625 maker transcript 337818 343277 . + . ID CLUHART00000008717; Parent CLUHARG00000005458; gene_id CLUHARG00000005458; original_biotype mrna; transcript_id CLUHART00000008717
scaffold625 maker exon 337818 337971 . + . ID "CLUHART00000008717:exon1"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker exon 340733 340841 . + . ID "CLUHART00000008717:exon2"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker exon 341518 341628 . + . ID "CLUHART00000008717:exon3"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker exon 341964 343277 . + . ID "CLUHART00000008717:exon4"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker CDS 337915 337971 . + 0 ID "CLUHART00000008717:cds"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker CDS 340733 340841 . + 0 ID "CLUHART00000008717:cds"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker CDS 341518 341628 . + 2 ID "CLUHART00000008717:cds"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker CDS 341964 343033 . + 2 ID "CLUHART00000008717:cds"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker five_prime_utr 337818 337914 . + . ID "CLUHART00000008717:five_prime_utr"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; transcript_id CLUHART00000008717
scaffold625 maker three_prime_utr 343034 343277 . + . ID "CLUHART00000008717:three_prime_utr"; Parent CLUHART00000008717; gene_id CLUHARG00000005458; original_biotype three_prime_UTR; transcript_id CLUHART00000008717
gffread
scaffold625 maker transcript 337818 343277 . + . transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker exon 337818 337971 . + . transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker exon 340733 340841 . + . transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker exon 341518 341628 . + . transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker exon 341964 343277 . + . transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker CDS 337915 337971 . + 0 transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker CDS 340733 340841 . + 0 transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker CDS 341518 341628 . + 2 transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
scaffold625 maker CDS 341964 343033 . + 2 transcript_id "CLUHART00000008717"; gene_id "CLUHARG00000005458";
It didn't fit in one post; here is the rest:
genome tools
scaffold625 maker exon 337818 337971 . + . gene_id "1"; transcript_id "1.1";
scaffold625 maker exon 340733 340841 . + . gene_id "1"; transcript_id "1.1";
scaffold625 maker exon 341518 341628 . + . gene_id "1"; transcript_id "1.1";
scaffold625 maker exon 341964 343277 . + . gene_id "1"; transcript_id "1.1";
scaffold625 maker CDS 337915 337971 . + 0 gene_id "1"; transcript_id "1.1";
scaffold625 maker CDS 340733 340841 . + 0 gene_id "1"; transcript_id "1.1";
scaffold625 maker CDS 341518 341628 . + 2 gene_id "1"; transcript_id "1.1";
scaffold625 maker CDS 341964 343033 . + 2 gene_id "1"; transcript_id "1.1";
ea-utils
scaffold625 maker exon 337818 337971 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker CDS 337915 337971 0 + 0 gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker CDS 340733 340841 0 + 0 gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker exon 340733 340841 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker CDS 341518 341628 0 + 2 gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker exon 341518 341628 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker CDS 341964 343033 0 + 2 gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
scaffold625 maker exon 341964 343277 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717:CLUHARG00000005458";
pasa (you need the fasta sequence too)
scaffold625 maker gene 337818 343277 0 + . gene_id "CLUHARG00000005458"; Name "TUBB3_2";
scaffold625 maker transcript 337818 343277 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker exon 337818 337971 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker CDS 337818 337971 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker exon 340733 340841 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker CDS 340733 340841 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker exon 341518 341628 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker CDS 341518 341628 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker exon 341964 343277 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
scaffold625 maker CDS 341964 343277 0 + . gene_id "CLUHARG00000005458"; transcript_id "CLUHART00000008717"; Name "TUBB3_2";
kent utils => I did not manage to get it to run (on OS X)
GFFtools-GX => I did not manage to get it to run
Across the different solutions, some lose attribute information, some do not remove feature types that GTF does not accept (3rd column), and some remove feature types that GTF does accept (see [here][2] for the list of accepted feature types in GTF).
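One way to compare converter outputs along these lines is to count the feature types in column 3 and flag rows that lack gene_id or transcript_id. The following is a rough Python sketch, not part of any of the tools above; `summarize_gtf` is a hypothetical helper name, and it assumes attribute values contain no semicolons.

```python
from collections import Counter


def summarize_gtf(gtf_lines):
    """Return (feature type counts, line numbers lacking required IDs)."""
    counts = Counter()
    incomplete = []
    for number, line in enumerate(gtf_lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        # Whitespace split keeps the whole attribute column as field 9.
        fields = line.rstrip("\n").split(None, 8)
        if len(fields) < 9:
            incomplete.append(number)
            continue
        counts[fields[2]] += 1
        # Attribute keys are the first token of each ';'-separated entry.
        keys = {part.split()[0] for part in fields[8].split(";") if part.strip()}
        # Assumption: gene rows only need gene_id, other rows need both IDs.
        required = {"gene_id"} if fields[2] == "gene" else {"gene_id", "transcript_id"}
        if not required <= keys:
            incomplete.append(number)
    return counts, incomplete


sample = [
    "scaffold625 maker gene 337818 343277 . + . ID CLUHARG00000005458; Name TUBB3_2; gene_id CLUHARG00000005458",
    'scaffold625 maker exon 337818 337971 . + . gene_id "1"; transcript_id "1.1";',
]
feature_counts, incomplete_lines = summarize_gtf(sample)
print(dict(feature_counts), incomplete_lines)
```

Running it on each tool's output makes it easy to spot which ones kept gene and transcript rows and which ones kept non-GTF feature types.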
Refer this
Have you tried searching for "converting .gff file to .gtf"? I got several hits from BioStars, SeqAnswers, ResearchGate... For example, gffread is cited very often; you could give it a try.
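If you want to script the gffread route, a small Python wrapper could look like the sketch below. It assumes gffread is installed and on your PATH; the function names and the file names are placeholders of mine, and `-T` (write GTF) and `-o` (output file) are the only gffread options used.

```python
import shutil
import subprocess


def gffread_command(gff_path, gtf_path):
    """Build the gffread call: -T requests GTF output, -o names the output file."""
    return ["gffread", gff_path, "-T", "-o", gtf_path]


def convert_with_gffread(gff_path, gtf_path):
    """Run gffread, failing early if the executable is not available."""
    if shutil.which("gffread") is None:
        raise FileNotFoundError("gffread not found on PATH")
    subprocess.run(gffread_command(gff_path, gtf_path), check=True)


# Placeholder file names; replace with your own paths before converting.
print(" ".join(gffread_command("annotation.gff3", "annotation.gtf")))
```

Check the resulting file before uploading it to Galaxy, since converters differ in which features and attributes they keep.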