Hi,
I have two GenBank files containing almost identical genome assemblies (one genome per file), with the same number of predicted genes and similar amino acid sequences:
file1: contains assembly gaps but has a manually curated annotation, i.e. custom 'product' qualifiers
     CDS             complement(551913..552146)
                     /locus_tag="KLOVNAE_00475"
                     /inference="ab initio prediction:Prodigal:002006"
                     /codon_start=1
                     /product="Ubiquitin-like homologous superfamily
                     domain-containing protein"
...
file2: a correct assembly with automated Prokka annotation only, which has some other qualifiers I would like to keep in the final GenBank file, e.g. 'gene', 'inference' ...
     CDS             complement(551913..552146)
                     /gene="vUbi_1"
                     /locus_tag="KLOVNAE_00475"
                     /inference="ab initio prediction:Prodigal:002006"
                     /inference="similar to AA sequence:UniProtKB:P16709"
                     /codon_start=1
                     /product="viral Ubiquitin"
...
Therefore I would like to replace the 'product' qualifier of every CDS feature in file2 with the corresponding one from the manually curated file1.
How could I achieve that, e.g. in Biopython? Hints/help would be highly appreciated!
Thanks and all the best, Flo
Do both files use the same gene IDs? (That is, can you link the records on gene name or something similar?)
All genes in both files have the same locus_tags: KLOVNAE_00001 ... 01000
I have included examples of the CDS features...
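Since the locus_tags match between the two files, one possible approach is to build a lookup of locus_tag -> 'product' from file1 and then overwrite the 'product' qualifier in file2, leaving 'gene', 'inference' and everything else untouched. Below is a minimal sketch of that idea, not a tested pipeline. The function names (collect_products, replace_products, merge_genbank) and the file names are made up for illustration; the only Biopython calls used are SeqIO.parse and SeqIO.write. It assumes every CDS carries exactly one locus_tag, and that a 'product' value wrapped over two lines in the file is read back as a single string.

```python
def collect_products(records):
    """Map locus_tag -> list of /product values for every CDS in `records`."""
    products = {}
    for record in records:
        for feature in record.features:
            if feature.type != "CDS":
                continue
            tags = feature.qualifiers.get("locus_tag")
            product = feature.qualifiers.get("product")
            if tags and product:
                products[tags[0]] = list(product)
    return products


def replace_products(records, products):
    """Overwrite /product on CDS features in place.

    Returns the locus_tags of CDS features with no curated product
    (None for a CDS that has no locus_tag at all).
    """
    unmatched = []
    for record in records:
        for feature in record.features:
            if feature.type != "CDS":
                continue
            tags = feature.qualifiers.get("locus_tag")
            if tags and tags[0] in products:
                # Only /product is touched; all other qualifiers are kept.
                feature.qualifiers["product"] = list(products[tags[0]])
            else:
                unmatched.append(tags[0] if tags else None)
    return unmatched


def merge_genbank(curated_path, target_path, out_path):
    """Read both GenBank files, transfer the products, write the result."""
    # Imported here so the two helpers above can be used without Biopython.
    from Bio import SeqIO

    products = collect_products(SeqIO.parse(curated_path, "genbank"))
    target_records = list(SeqIO.parse(target_path, "genbank"))
    unmatched = replace_products(target_records, products)
    SeqIO.write(target_records, out_path, "genbank")
    return unmatched
```

With placeholder file names, the call would be something like merge_genbank("file1.gbk", "file2.gbk", "file2_curated.gbk"). The returned list shows which CDS features kept their Prokka product, which is worth checking before trusting the output. Matching on locus_tag rather than on coordinates seems safer here, because the gaps in file1 could shift positions between the two assemblies.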