How to resolve multiple CDS features with the same locus tag in EMBL format
3.6 years ago

I generated some EMBL files from GFF3 and am ending up with multiple CDS features sharing the same locus tag. In the ones I have checked, there are 2 CDS features per locus tag. Is this because there is one CDS on the forward strand and one on the reverse strand?

The tool I want to use does not allow multiple CDS features on the same locus tag. How can I avoid this, or is there a simple way to fix it automatically?

Example EMBL: https://pastebin.com/LkzfB0nF

annotation embl • 1.5k views
3.6 years ago
Juke34 8.9k

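There are two CDS features because there are two isoforms; it is not a strand issue. Each isoform gets its own CDS under the same gene, hence the same locus tag.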
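
If you want to confirm this on your own data, here is a minimal Python sketch that lists genes with more than one transcript in the source GFF3. It assumes a standard layout in which each mRNA (or transcript) line carries an ID and a single Parent attribute pointing at its gene; the function names are illustrative and not part of any tool.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Turn 'ID=x;Parent=y' into {'ID': 'x', 'Parent': 'y'}."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def genes_with_multiple_isoforms(gff3_lines):
    """Return {gene_id: [transcript_ids]} for genes with 2+ transcripts.

    Assumption: transcripts are typed 'mRNA' or 'transcript' and each
    has exactly one Parent (the gene).
    """
    transcripts = defaultdict(list)
    for line in gff3_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] not in ("mRNA", "transcript"):
            continue
        attrs = parse_attributes(cols[8])
        if "ID" in attrs and "Parent" in attrs:
            transcripts[attrs["Parent"]].append(attrs["ID"])
    return {gene: ids for gene, ids in transcripts.items() if len(ids) > 1}
```

To use it, pass an open file handle, e.g. `genes_with_multiple_isoforms(open("annotation.gff3"))`, where the file name is a placeholder for your own annotation.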


Is there a way I can avoid this, or at least give the CDS features different locus tags? I want to use antiSMASH, and it does not allow multiple CDS features on one locus tag.


Prior to conversion, you can use AGAT to keep only the longest isoform of each gene.
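
In AGAT the relevant script is, as far as I recall, agat_sp_keep_longest_isoform.pl; check its help output for the exact options. To illustrate what "keep the longest isoform" means, here is a rough Python sketch of the idea. It is not AGAT's implementation. It assumes every feature has a single Parent, that "longest" means the largest summed CDS length, and that ties go to the first transcript seen; comment and directive lines are dropped from the output.

```python
from collections import defaultdict


def _attrs(column9):
    """Parse a GFF3 attribute column into a dict."""
    pairs = (f.split("=", 1) for f in column9.strip().split(";") if "=" in f)
    return {key.strip(): value.strip() for key, value in pairs}


def keep_longest_isoform(gff3_lines):
    """Return GFF3 feature lines with one transcript kept per gene."""
    records = []
    for line in gff3_lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9:
            continue  # comments and malformed lines are not carried over
        records.append((cols, _attrs(cols[8])))

    # Map each transcript ID to its gene ID.
    gene_of = {}
    for cols, a in records:
        if cols[2] in ("mRNA", "transcript") and "ID" in a and "Parent" in a:
            gene_of[a["ID"]] = a["Parent"]

    # Sum CDS length per transcript (GFF3 coordinates are 1-based, inclusive).
    cds_len = defaultdict(int)
    for cols, a in records:
        if cols[2] == "CDS" and a.get("Parent") in gene_of:
            cds_len[a["Parent"]] += int(cols[4]) - int(cols[3]) + 1

    # Pick the transcript with the largest CDS total for each gene.
    best = {}
    for tid, gid in gene_of.items():
        if gid not in best or cds_len[tid] > cds_len[best[gid]]:
            best[gid] = tid
    kept = set(best.values())

    out = []
    for cols, a in records:
        if a.get("ID") in gene_of and a["ID"] not in kept:
            continue  # a transcript that lost the comparison
        if a.get("Parent") in gene_of and a["Parent"] not in kept:
            continue  # exon/CDS/UTR belonging to a dropped transcript
        out.append("\t".join(cols))
    return out
```

For real data the AGAT script is the safer choice, since it handles the many GFF3 variants this sketch ignores.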


This has solved the issue, thank you!


You can accept the parent answer to provide closure to this thread.
