I have a lot of GenBank files with multiple genes in them. Some of these genes have a single start and stop position, e.g. 1000..1390, and some have multiple start and stop positions, e.g. join(1000..1390,1400..1790,1900..2275).
For the genes with multiple start and stop positions, I want to duplicate the entire CDS feature once per start/stop pair and give each duplicate only one start and stop position.
So 1 CDS with 3 start/stop pairs should become 3 CDSs with 1 start/stop pair each.
Does anyone have a clue how to achieve this?
Are you sure those "multiple start/stop" CDSs are not in fact indicating the intron/exon boundaries?
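If you still want the split, here is a minimal sketch in plain Python of the location-splitting step, not a complete GenBank rewriter. The function names `split_location` and `split_cds` are made up for illustration, and the sketch assumes each location is a simple range, a `join(...)`, or a `complement(...)` of either, with all parts on the same strand.

```python
import re

# One span such as 1000..1390, optionally with partial markers (<1000..>1390).
SPAN = re.compile(r"[<>]?\d+\.\.[<>]?\d+")


def split_location(location):
    """Return one location string per start..stop span.

    Assumption: if complement( appears anywhere in the location,
    every span is on the reverse strand (no mixed-strand joins).
    """
    spans = SPAN.findall(location)
    if "complement(" in location:
        return [f"complement({span})" for span in spans]
    return spans


def split_cds(location, qualifiers):
    """Duplicate a CDS once per span, copying the qualifiers each time."""
    return [(loc, dict(qualifiers)) for loc in split_location(location)]


if __name__ == "__main__":
    cds = split_cds(
        "join(1000..1390,1400..1790,1900..2275)",
        {"gene": "exampleGene"},  # hypothetical qualifier, for illustration
    )
    for loc, quals in cds:
        print("CDS", loc, quals)
```

Note that qualifiers such as /translation and /codon_start describe the joined CDS as a whole, so copying them unchanged onto each piece will not describe that piece correctly. That is worth settling together with the intron/exon question above before rewriting any files.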