Hi! I have a file containing several GenBank records written one after the other. For every CDS on every contig, I need to extract the protein sequence (/translation), /locus_tag, /inference, /product and the contig ID. How can I do it?
The input format looks like this
And the result looks like this
How can I do this?
Since you are analyzing data, it would help if you made some effort to write a small script that reads the file line by line and processes it.
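As a starting point, here is a minimal sketch of such a line-by-line script in plain Python. It makes several assumptions that the post does not confirm: the file follows the standard GenBank flat-file layout (feature keys start in column 6, qualifiers are indented further and start with `/`), the contig ID is the second field of each `LOCUS` line, and a tab-separated table is an acceptable output. The names `parse_cds`, `finish` and `write_tsv` are made up for this example.

```python
import csv
import sys

# Qualifiers to keep for each CDS (taken from the question).
WANTED = ("locus_tag", "inference", "product", "translation")


def finish(contig, quals):
    """Turn the collected qualifiers of one CDS into a flat row."""
    row = {"contig": contig, **{name: "" for name in WANTED}}
    for name, parts in quals:
        if name not in WANTED:
            continue
        # Protein sequences wrap without spaces; other values wrap at word breaks.
        joiner = "" if name == "translation" else " "
        value = joiner.join(parts).strip('"')
        # A CDS may carry several /inference lines; keep them all.
        row[name] = f"{row[name]}; {value}" if row[name] else value
    return row


def parse_cds(lines):
    """Yield one dict per CDS feature found in an iterable of GenBank lines."""
    contig = None        # ID from the most recent LOCUS line (assumed 2nd field)
    in_features = False  # True while inside a FEATURES table
    quals = None         # [name, parts] pairs of the CDS being read, else None
    for raw in lines:
        line = raw.rstrip()
        if line.startswith("LOCUS"):
            contig = line.split()[1]
        elif line.startswith("FEATURES"):
            in_features = True
        elif in_features and line and not line.startswith(" "):
            # ORIGIN, "//" or another top-level keyword ends the feature table.
            if quals is not None:
                yield finish(contig, quals)
            quals, in_features = None, False
        elif in_features and line[5:6].strip():
            # A new feature key (source, gene, CDS, ...) starts in column 6.
            if quals is not None:
                yield finish(contig, quals)
            quals = [] if line.split()[0] == "CDS" else None
        elif in_features and quals is not None and line.strip():
            text = line.strip()
            if text.startswith("/"):
                name, _, value = text[1:].partition("=")
                quals.append([name, [value]])
            elif quals:
                # Continuation of a wrapped qualifier value.
                quals[-1][1].append(text)
    if quals is not None:  # file ended without a terminating "//"
        yield finish(contig, quals)


def write_tsv(rows, out=sys.stdout):
    """Write the rows as a tab-separated table with a header line."""
    writer = csv.DictWriter(out, fieldnames=("contig",) + WANTED,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
```

To run it on a file, open the file and pass the handle through both functions, for example `with open("records.gbk") as handle: write_tsv(parse_cds(handle))`, where `records.gbk` is a placeholder name. One known weakness of this sketch: a wrapped qualifier value whose continuation line happens to begin with `/` would be mistaken for a new qualifier. If installing a library is an option, Biopython's `SeqIO.parse(handle, "genbank")` handles such edge cases and may be the more robust choice.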