Hello,
Can someone please help me with this issue I'm having? Thank you in advance!
I have a GFF file in which the gene_name attribute is present only on the gene rows (i.e., it is absent from the transcript, CDS, and exon rows). I want to add gene_name to every row of the file, so that each feature carries the same gene_name as its Parent gene feature.
I tried using agat_sp_manage_attributes.pl from AGAT, but it only allows me to add new attributes with empty values.
This is what I have:
scaffold1 maker gene 288062 292903 . - . ID=XUN_000004;gene_id=XUN_000004;gene_name=LINGO2
scaffold1 maker transcript 288062 292903 . - . ID=XUN_000004-T1;Parent=XUN_000004;Dbxref=PFAM:PF13306,PFAM:PF07679,PFAM:PF13927,PFAM:PF00047,PFAM:PF13855;gene_id=XUN_000004;note=COG:T,EggNog:ENOG503B9G0;original_biotype=mrna;product=Leucine-rich repeat and immunoglobulin-like domain-containing nogo receptor-interacting protein 2;transcript_id=XUN_000004-T1
scaffold1 maker exon 288062 289918 . - . ID=XUN_000004-T1.exon2;Parent=XUN_000004-T1;gene_id=XUN_000004;transcript_id=XUN_000004-T1
scaffold1 maker exon 292892 292903 . - . ID=XUN_000004-T1.exon1;Parent=XUN_000004-T1;gene_id=XUN_000004;transcript_id=XUN_000004-T1
scaffold1 maker CDS 288062 289918 . - 0 ID=XUN_000004-T1.cds;Parent=XUN_000004-T1;gene_id=XUN_000004;transcript_id=XUN_000004-T1
scaffold1 maker CDS 292892 292903 . - 0 ID=XUN_000004-T1.cds;Parent=XUN_000004-T1;gene_id=XUN_000004;transcript_id=XUN_000004-T1
scaffold1 maker gene 295971 297761 . + . ID=XUN_000005;gene_id=XUN_000005;gene_name=XUN_000005
scaffold1 maker transcript 295971 297761 . + . ID=XUN_000005-T1;Parent=XUN_000005;gene_id=XUN_000005;original_biotype=mrna;product=hypothetical protein;transcript_id=XUN_000005-T1
This is what I need:
scaffold1 maker gene 288062 292903 . - . ID=XUN_000004;gene_id=XUN_000004;gene_name=LINGO2
scaffold1 maker transcript 288062 292903 . - . ID=XUN_000004-T1;Parent=XUN_000004;Dbxref=PFAM:PF13306,PFAM:PF07679,PFAM:PF13927,PFAM:PF00047,PFAM:PF13855;gene_id=XUN_000004;gene_name=LINGO2;note=COG:T,EggNog:ENOG503B9G0;original_biotype=mrna;product=Leucine-rich repeat and immunoglobulin-like domain-containing nogo receptor-interacting protein 2;transcript_id=XUN_000004-T1
scaffold1 maker exon 288062 289918 . - . ID=XUN_000004-T1.exon2;Parent=XUN_000004-T1;gene_id=XUN_000004;gene_name=LINGO2;transcript_id=XUN_000004-T1
scaffold1 maker exon 292892 292903 . - . ID=XUN_000004-T1.exon1;Parent=XUN_000004-T1;gene_id=XUN_000004;gene_name=LINGO2;transcript_id=XUN_000004-T1
scaffold1 maker CDS 288062 289918 . - 0 ID=XUN_000004-T1.cds;Parent=XUN_000004-T1;gene_id=XUN_000004;gene_name=LINGO2;transcript_id=XUN_000004-T1
scaffold1 maker CDS 292892 292903 . - 0 ID=XUN_000004-T1.cds;Parent=XUN_000004-T1;gene_id=XUN_000004;gene_name=LINGO2;transcript_id=XUN_000004-T1
scaffold1 maker gene 295971 297761 . + . ID=XUN_000005;gene_id=XUN_000005;gene_name=XUN_000005
scaffold1 maker transcript 295971 297761 . + . ID=XUN_000005-T1;Parent=XUN_000005;gene_id=XUN_000005;gene_name=XUN_000005;original_biotype=mrna;product=hypothetical protein;transcript_id=XUN_000005-T1
Does the solution have to be Perl, or are you comfortable using other scripting languages?
Hi jv, I'm comfortable using any scripting language for this. Thank you.
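Since any language is fine, here is a minimal Python sketch of one way to do it. It is not an AGAT feature, and it rests on a few assumptions:

- The real file is tab-delimited, as GFF3 requires. The rows pasted above show spaces, which is probably just how the forum rendered them.
- Every child row points to its parent through Parent=, and when a row lists several parents, only the first is followed.
- The new attribute goes right after gene_id, to match the desired output above, or at the end when the row has no gene_id.

The function names `add_gene_names` and `rewrite_file` are made up for this example.

```python
def add_gene_names(lines):
    """Copy gene_name from each gene row onto its descendant rows.

    `lines` is a list of GFF3 lines; the rewritten lines are returned
    as a list of strings without trailing newlines.
    """
    parent_of = {}  # feature ID -> its (first) Parent ID
    name_of = {}    # feature ID -> gene_name, for rows that already have one
    parsed = []     # per line: (columns, attribute items, attribute dict) or None

    # Pass 1: index IDs, Parents and existing gene_name values.
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9:
            parsed.append(None)  # comments, directives, blank or malformed lines
            continue
        items = [item for item in cols[8].split(";") if item]
        attrs = dict(item.partition("=")[::2] for item in items)
        if "ID" in attrs:
            if "Parent" in attrs:
                parent_of[attrs["ID"]] = attrs["Parent"].split(",")[0]
            if "gene_name" in attrs:
                name_of[attrs["ID"]] = attrs["gene_name"]
        parsed.append((cols, items, attrs))

    # Pass 2: for rows lacking gene_name, climb the Parent chain to find one.
    out = []
    for line, row in zip(lines, parsed):
        if row is None:
            out.append(line.rstrip("\n"))  # pass through unchanged
            continue
        cols, items, attrs = row
        if "gene_name" not in attrs:
            current = attrs.get("Parent", "").split(",")[0]
            seen = set()  # guards against accidental Parent cycles
            while current and current not in name_of and current not in seen:
                seen.add(current)
                current = parent_of.get(current, "")
            name = name_of.get(current)
            if name is not None:
                keys = [item.partition("=")[0] for item in items]
                pos = keys.index("gene_id") + 1 if "gene_id" in keys else len(items)
                items.insert(pos, "gene_name=" + name)
                cols[8] = ";".join(items)
        out.append("\t".join(cols))
    return out


def rewrite_file(in_path, out_path):
    """Read a GFF3 file, add gene_name to child rows, and write the result."""
    with open(in_path) as src:
        lines = src.readlines()
    with open(out_path, "w") as dst:
        for line in add_gene_names(lines):
            dst.write(line + "\n")
```

To try it, call something like `rewrite_file("input.gff", "output.gff")` (both file names are placeholders). The file is read in full before anything is written, so it does not matter whether a gene row comes before or after its children. Rows whose Parent chain never reaches a feature with gene_name are left as they are.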