Hello all, I currently have a GFF file with functional annotations for predicted genes in a metagenome. I used another tool to predict functions for the same set of genes, which produced a separate GFF file, and I would like to combine the two files into one that stores both functional annotations for each gene. For example, say I have a gene annotated like this in file A:
VMAG_100.1 PhATE CDS 1 399 . - . ID=VMAG_100.1_consensus_49_geneCall_cds; annot1=(hmm search - jackhmmer) gi|966201526|ref|YP_009191714.1| hypothetical protein T12_45 [Streptococcus phage T12]
And I have the same gene with a different annotation in file B:
VMAG_100.1 PhATE CDS 1 399 . - . ID=VMAG_100.1_consensus_49_geneCall_cds; annot1=(hmm search - jackhmmer) gi|389060239|ref|YP_006383371.1| hypothetical protein TSMG0091 [Halocynthia phage JM-2012]
How might I produce a consensus file with an output like this:
VMAG_100.1 PhATE CDS 1 399 . - . ID=VMAG_100.1_consensus_49_geneCall_cds; annot1=(hmm search - jackhmmer) gi|966201526|ref|YP_009191714.1| hypothetical protein T12_45 [Streptococcus phage T12]; annot2=(hmm search - jackhmmer) gi|389060239|ref|YP_006383371.1| hypothetical protein TSMG0091 [Halocynthia phage JM-2012];
Adding the functions manually won't work well considering I have thousands of genes in each of these files. Thank you in advance for the help!
Check the AGAT toolkit (LINK). There should be something in there to do this.

This worked great for me, thank you.
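
If a scripted route is useful as well, here is a minimal Python sketch of the merge described in the question. It is not how AGAT does it, just one way to get the output shown above, and it rests on a few assumptions: both files are tab-delimited with nine columns, each feature has a unique `ID=` attribute that is identical in both files, attribute values contain no `;`, and the annotation to carry over from file B is stored under `annot1`. The function names (`parse_attributes`, `read_annotations`, `merge_gff`, `merge_files`) are made up for this example.

```python
def parse_attributes(column9):
    """Split a GFF attribute column into a dict of key -> value.

    Assumes fields are separated by ';' and written as key=value.
    """
    attrs = {}
    for field in column9.split(";"):
        field = field.strip()
        if not field:
            continue
        key, _, value = field.partition("=")
        attrs[key] = value
    return attrs


def read_annotations(lines, key="annot1"):
    """Map each feature ID in file B to the value of its `key` attribute."""
    by_id = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip anything that is not a 9-column feature line
        attrs = parse_attributes(cols[8])
        if "ID" in attrs and key in attrs:
            by_id[attrs["ID"]] = attrs[key]
    return by_id


def merge_gff(lines_a, lines_b, src_key="annot1", new_key="annot2"):
    """Return file A's lines, with file B's annotation appended as `new_key`.

    Lines of A with no matching ID in B are passed through unchanged.
    """
    extra = read_annotations(lines_b, src_key)
    merged = []
    for line in lines_a:
        stripped = line.rstrip("\n")
        cols = stripped.split("\t")
        if stripped.startswith("#") or len(cols) != 9:
            merged.append(stripped)  # comments, directives, blank lines
            continue
        feature_id = parse_attributes(cols[8]).get("ID")
        if feature_id in extra:
            base = cols[8].rstrip().rstrip(";")
            cols[8] = f"{base}; {new_key}={extra[feature_id]};"
        merged.append("\t".join(cols))
    return merged


def merge_files(path_a, path_b, path_out):
    """Convenience wrapper: merge two GFF files on disk into a third."""
    with open(path_a) as fa, open(path_b) as fb:
        merged = merge_gff(fa, fb)
    with open(path_out, "w") as out:
        out.write("\n".join(merged) + "\n")
```

Calling `merge_files("A.gff", "B.gff", "merged.gff")` (placeholder file names) would write file A with an extra `annot2=...;` on every gene that also appears in file B. Matching on `ID` rather than on coordinates keeps the sketch short; if the two tools assign different IDs, the key would have to be built from the sequence name, start, end, and strand instead.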