How can I create a GFF file with the "Name" attribute? I have tried both Prodigal and Prokka; however, the GFF files they produce lack the Name attribute, which I need for a downstream analysis.
You can use awk. With this one-liner, the content of Dbxref is copied into a new attribute "Name".
awk -F'\t' '{split($9,a,";");split(a[3],b,":"); newname=b[2]; print $0";Name="newname}' your.gff3
This one-liner assumes that Dbxref is the 3rd attribute in the 9th column.
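If the attribute is not always in the same position, looking it up by key is less brittle than counting fields. The following is only a sketch under a few assumptions: the function name add_name_attr is made up for illustration, the key defaults to ID, the whole value after "=" is copied (so for Dbxref the database prefix is kept), and lines that already carry a Name attribute are left alone. It also passes comment lines and anything after a ##FASTA line through unchanged.

```shell
# add_name_attr KEY: read GFF3 on stdin and append Name=<value of KEY> to column 9.
# Hypothetical helper, not part of Prodigal or Prokka; KEY defaults to ID.
add_name_attr() {
  awk -F'\t' -v OFS='\t' -v key="${1:-ID}" '
    /^##FASTA/ { fasta = 1 }                 # sequences may follow this line
    fasta || /^#/ || NF < 9 { print; next }  # pass through non-feature lines
    {
      val = ""
      n = split($9, attrs, ";")
      for (i = 1; i <= n; i++) {
        eq = index(attrs[i], "=")
        if (eq && substr(attrs[i], 1, eq - 1) == key) {
          val = substr(attrs[i], eq + 1)     # everything after the first "="
          break
        }
      }
      if (val != "" && $9 !~ /(^|;)Name=/) {
        sub(/;$/, "", $9)                    # drop a trailing ";" before appending
        $9 = $9 ";Name=" val
      }
      print
    }'
}
```

It would be called as add_name_attr ID < your.gff3 > your_named.gff3, with the file names being placeholders.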
If OP had formatted their added info in a more readable way, you'd probably have seen that Dbxref is not an attribute provided by the ORF caller:
NODE_23_length_59792_cov_23.204747 Prodigal_v2.6.3 CDS 1 147 19.5 - 0 ID=1_1;partial=10;start_type=TTG;rbs_motif=None;rbs_spacer=None;gc_cont=0.299;conf=98.71;score=18.89;cscore=30.86;sscore=-11.98;rscore=-0.99;uscore=-0.73;tscore=-9.61;
NODE_23_length_59792_cov_23.204747 Prodigal_v2.6.3 CDS 523 1983 198.6 - 0 ID=1_2;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.300;conf=99.99;score=198.00;cscore=196.39;sscore=1.61;rscore=-0.99;uscore=0.35;tscore=2.90;
Definitely, that is very true. I hope OP can play with the command and adapt it to their needs, though; it is quite straightforward. For example, to copy the content of the ID attribute, just change the code to the following:
awk -F'\t' '{split($9,a,";");split(a[1],b,"="); newname=b[2]; print $0";Name="newname}' your.gff3
context is missing.
For instance, I annotated my MAGs using Prokka and Prodigal, respectively. The GFF files that I obtain afterwards lack the attribute "Name", e.g. Prodigal:
Prokka:
I need to get a GFF file with the following attributes, including "Name", e.g.
(...)
sigh... This comment is not an answer, you'd better add it to your original post. And add some formatting, for example enclose the gff sections with code blocks (the 101010 icon), have each line on their own line, etc.
Getting the 'name' attribute is a data analyst's job. Prodigal will give you the gene predictions; you'll need to match those with functional annotations.
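As a rough sketch of that matching step, assuming you already have a two-column, tab-separated table whose first column equals the ID attribute in the GFF and whose second column is the name you want (both the table layout and the function name name_from_table are assumptions for illustration, not something Prodigal or Prokka produces), awk can do the lookup:

```shell
# name_from_table TABLE: read GFF3 on stdin and add Name=<name> for every
# feature whose ID appears in TABLE (hypothetical TSV: ID <tab> name).
# TABLE must not be empty, otherwise the NR == FNR test also matches stdin.
name_from_table() {
  awk -F'\t' -v OFS='\t' '
    NR == FNR { name[$1] = $2; next }   # first input: load the lookup table
    /^#/ || NF < 9 { print; next }      # pass through non-feature lines
    {
      id = ""
      n = split($9, attrs, ";")
      for (i = 1; i <= n; i++)
        if (attrs[i] ~ /^ID=/) { id = substr(attrs[i], 4); break }
      if (id != "" && (id in name)) {
        sub(/;$/, "", $9)               # drop a trailing ";" before appending
        $9 = $9 ";Name=" name[id]
      }
      print
    }' "$1" -
}
```

Features without a match in the table are printed unchanged, so you can see afterwards which predictions still lack a name.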