Hi Everyone,
I have a FASTA file whose headers carry functional and GO annotations. I also have an associated GFF3 file with the locations of these genes in the genome. The IDs in the GFF3 and FASTA files match. I need to append the annotation information from each FASTA header to the attributes (ninth) column of the matching lines in the GFF3 file. Something like this:
Fasta Headers:
>evm.model.Scaffold_003599.4 Protein=Olfactory_receptor GO=GO:0004930,...
Current GFF3:
Scaffold_003599 EVM mRNA 187035 187979 . + . ID=evm.model.Scaffold_003599.4
Desired GFF3:
Scaffold_003599 EVM mRNA 187035 187979 . + . ID=evm.model.Scaffold_003599.4 Protein=Olfactory_receptor GO=GO:0004930,
Basically, I need to replace each ID=... (in the GFF3) with ID=... Protein=... GO=... (from the FASTA headers).
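To make the goal concrete, the logic I have in mind could probably be sketched in Python roughly as follows. This is only a sketch under assumptions, not a tested tool: the function names read_fasta_annotations and annotate_gff3 and the sep parameter are made up for illustration, the GFF3 is assumed to be tab-delimited with the ID= attribute in column 9, and everything after the first whitespace in a FASTA header is treated as the annotation. The default separator is a space to match my desired output above; the GFF3 specification separates attributes with ";", so sep=";" may be the safer choice.

```python
def read_fasta_annotations(fasta_lines):
    """Map each sequence ID to the rest of its FASTA header line."""
    annotations = {}
    for line in fasta_lines:
        if line.startswith(">"):
            # Split the header into the ID and everything after it.
            parts = line[1:].strip().split(None, 1)
            if len(parts) == 2:
                annotations[parts[0]] = parts[1]
    return annotations


def annotate_gff3(gff_lines, annotations, sep=" "):
    """Yield GFF3 lines with the matching annotation appended to column 9."""
    for line in gff_lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        # Pass comments, directives and malformed lines through unchanged.
        if line.startswith("#") or len(fields) != 9:
            yield line
            continue
        for attribute in fields[8].split(";"):
            if attribute.startswith("ID="):
                extra = annotations.get(attribute[3:].strip())
                if extra:
                    fields[8] = fields[8].rstrip(";") + sep + extra
                break
        yield "\t".join(fields)


# Hypothetical usage (file names are placeholders):
# with open("proteins.fasta") as fa, open("genes.gff3") as gff:
#     annotations = read_fasta_annotations(fa)
#     for out_line in annotate_gff3(gff, annotations):
#         print(out_line)
```

Lines whose ID has no matching FASTA header (for example gene or exon features) would be left unchanged by this approach.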
I feel like this should not be that difficult a task, but it is just out of my range of scripting skills at the moment (or it would take me a long time to figure out how to do this by trial and error).
Does anyone know of a current tool or script to accomplish this?
Thanks!