Count number of Genes in GFF3 file (Prokka output)
4.1 years ago
Bioinfo ▴ 20

Hello,

I have run Prokka on eight strains and now have eight GFF files. I want to count the number of genes in each GFF file. Could you please tell me how to do that?

Thank you

assembly sequence gene sequencing • 1.7k views

If your GFF3 files are properly formatted, you could count the lines that have gene in field 3 (the feature type column) of each file. For example, this line would count as one gene:

ctg123 . gene            1000  9000  .  +  .  ID=gene00001;Name=EDEN
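The column-3 count described above can be sketched with awk as follows. This is only a minimal sketch: the function name count_genes and the *.gff file glob are assumptions for illustration, and it assumes tab-separated GFF3 in which a ##FASTA line (which Prokka GFF files normally contain) marks the end of the annotation records. Because only column 3 is compared, the word gene inside attributes such as ID or Parent is not counted. If your files contain no gene features, the same approach should work with CDS in place of gene.

```shell
# Count features whose type (column 3) is exactly "gene" in one GFF3 file.
# count_genes is a hypothetical helper name, not part of Prokka.
count_genes() {
    awk -F'\t' '
        /^##FASTA/   { exit }   # stop before the embedded sequence section
        /^#/         { next }   # skip header and comment lines
        $3 == "gene" { n++ }    # match the feature type column only
        END          { print n + 0 }
    ' "$1"
}

# Print one "file<TAB>count" line per GFF file in the current directory.
# The *.gff glob is an assumption about how the files are named.
for f in *.gff; do
    [ -e "$f" ] || continue
    printf '%s\t%s\n' "$f" "$(count_genes "$f")"
done
```

Run it from the folder that holds the eight GFF files; each file is reported on its own line.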

Thank you for your answer, but I think the word gene appears more than once for each gene. The file looks like this:

scaffold1_size61041 prokka  gene    29  829 .   +   .   ID=wAse_00001_gene;locus_tag=wAse_00001
scaffold1_size61041 Prodigal:002006 CDS 29  829 .   +   0   ID=wAse_00001;Parent=wAse_00001_gene;inference=ab initio prediction:Prodigal:002006;locus_tag=wAse_00001;product=hypothetical protein
scaffold1_size61041 prokka  gene    1262    2740    .   +   .   ID=wAse_00002_gene;locus_tag=wAse_00002
scaffold1_size61041 Prodigal:002006 CDS 1262    2740    .   +   0   ID=wAse_00002;Parent=wAse_00002_gene;inference=ab initio prediction:Prodigal:002006;locus_tag=wAse_00002;product=hypothetical protein

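Those appear to be separate genes: wAse_00001_gene and wAse_00002_gene have different IDs and coordinates, and each CDS line is a child feature of its gene (see the Parent attribute), not a second gene record. Counting only the lines with gene in field 3 therefore counts each gene once.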

4.1 years ago
Juke34 8.9k

You can use agat_sp_statistics.pl from AGAT

4.1 years ago
h.mon 35k

Prokka writes a very detailed log to the ${prefix}.log file in the output folder, and it includes the feature counts. E.g.:

grep "Found" ~/projects/annotation/prokka/ngs56.log
 [15:23:01] Found 51 tRNAs
 [15:23:15] Found 7 rRNAs
 [15:25:16] Found 47 ncRNAs.
 [15:25:56] Found 3 CRISPRs
 [15:26:10] Found 2049 CDS
 [15:26:59] Found 892 unique /gene codes.
  