Script for getting summary statistics of any genome using GTF or GFF3?
9.1 years ago
vahapel ▴ 210

Hi All,

I am looking for a script that reports summary statistics, such as transcript count, base count, feature lengths, and intron lengths, from GTF/GFF3 file(s) and a genome.

Thank you for all your help!

gene genome sequence • 9.0k views
9.1 years ago
Juke34 8.9k

There are several solutions for that (updated to put everything in one place):

  • In Perl, I use agat_sp_statistics.pl from the GFF toolkit AGAT. See here for a sample of the output. This solution has the advantage of working with any GTF/GFF flavor (even unsorted files and files with errors).
  • In Python, GAG is a good solution for this purpose: http://genomeannotation.github.io/GAG. From a directory containing your genome (genome.fasta) and your annotation (genome.gff), launch GAG, load the files by typing "load" (by default it looks for genome.fasta and genome.gff), and then type "info" to get complete summary statistics for your annotation. It works perfectly well with the GFF3 format.
  • In Perl+bash there is GFF-Ex. When I tried it, it didn't work for me (maybe due to the specific GFF flavour I was using).
  • In Bash using awk or grep commands
  • There are solutions in R, see here for an example.
  • Using GenomeTools with the command gt stat
  • bedtools
  • gffutils
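
To illustrate the kind of numbers these tools report, here is a minimal Python sketch using only the standard library. It is not how AGAT, GAG, or any of the tools above work internally; it simply counts features per type, sums their lengths, and infers intron lengths from the gaps between exons sharing the same Parent. The function names (parse_gff3, summarize) and the example records are made up for illustration. It assumes GFF3-style attributes (key=value pairs separated by ";"), so a GTF file would need a different attribute parser.

```python
from collections import defaultdict
from statistics import mean


def parse_gff3(lines):
    """Yield (feature_type, start, end, attributes) for each feature line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip malformed lines instead of failing
        attrs = {}
        for field in cols[8].split(";"):
            if "=" in field:
                key, value = field.split("=", 1)
                attrs[key.strip()] = value.strip()
        yield cols[2], int(cols[3]), int(cols[4]), attrs


def summarize(lines):
    """Return {feature_type: {"count", "total_bp", "mean_bp"}}."""
    lengths = defaultdict(list)
    exons_by_parent = defaultdict(list)
    for ftype, start, end, attrs in parse_gff3(lines):
        # GFF coordinates are 1-based and inclusive.
        lengths[ftype].append(end - start + 1)
        if ftype == "exon" and "Parent" in attrs:
            for parent in attrs["Parent"].split(","):
                exons_by_parent[parent].append((start, end))

    # Assumption: an intron is the gap between consecutive exons of one parent.
    introns = []
    for spans in exons_by_parent.values():
        spans.sort()
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            gap = next_start - prev_end - 1
            if gap > 0:
                introns.append(gap)
    if introns:
        lengths["intron_inferred"] = introns

    return {
        ftype: {"count": len(v), "total_bp": sum(v), "mean_bp": mean(v)}
        for ftype, v in lengths.items()
    }


# Hypothetical records, only to show the expected input shape.
EXAMPLE = [
    "##gff-version 3",
    "chr1\tdemo\tgene\t1\t1000\t.\t+\t.\tID=gene1",
    "chr1\tdemo\tmRNA\t1\t1000\t.\t+\t.\tID=tx1;Parent=gene1",
    "chr1\tdemo\texon\t1\t200\t.\t+\t.\tID=e1;Parent=tx1",
    "chr1\tdemo\texon\t401\t600\t.\t+\t.\tID=e2;Parent=tx1",
    "chr1\tdemo\texon\t901\t1000\t.\t+\t.\tID=e3;Parent=tx1",
]

for ftype, stats in summarize(EXAMPLE).items():
    print(ftype, stats["count"], stats["total_bp"], round(stats["mean_bp"], 1))
```

For a real annotation you would pass an open file handle instead of the example list, e.g. summarize(open("genome.gff")). For messy or non-standard files, the dedicated tools listed above are a safer choice than a hand-rolled parser.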


Hi Juke-34, thank you for introducing "GAG" to me; it is perfectly suited to the project.

9.1 years ago
h.mon 35k

bedtools probably does a lot of what you want; have a look at its documentation and usage examples.


Dear h.mon,

BedTools is perfect in many aspects. During my search for GFF3 parsing, I encountered some very useful tools:

It can also be useful for GFF parsing (it probably works for GTF if small changes are made).
