2.5 years ago
JC
Hello everyone,
I am working with a eukaryotic organism, and the official annotation (GFF3, from PomBase) looks like this:
I PomBase gene 1798347 1799015 . + . ID=SPAC1002.01;Name=mrx11
I PomBase mRNA 1798347 1799015 . + . ID=SPAC1002.01.1;Parent=SPAC1002.01
I PomBase CDS 1798347 1798830 . + 0 ID=SPAC1002.01.1:exon:1;Parent=SPAC1002.01.1
I PomBase intron 1798831 1798959 . + . ID=SPAC1002.01.1:intron:1;Parent=SPAC1002.01.1
I PomBase CDS 1798960 1799015 . + 0 ID=SPAC1002.01.1:exon:2;Parent=SPAC1002.01.1
I PomBase gene 1799061 1800053 . + . ID=SPAC1002.02;Name=pom34
I PomBase mRNA 1799061 1800053 . + . ID=SPAC1002.02.1;Parent=SPAC1002.02
I PomBase five_prime_UTR 1799061 1799127 . + . ID=SPAC1002.02.1:five_prime_UTR:1;Parent=SPAC1002.02.1
I PomBase CDS 1799128 1799817 . + 0 ID=SPAC1002.02.1:exon:1;Parent=SPAC1002.02.1
I PomBase three_prime_UTR 1799818 1800053 . + . ID=SPAC1002.02.1:three_prime_UTR:1;Parent=SPAC1002.02.1
I PomBase gene 1799915 1803141 . - . ID=SPAC1002.03c;Name=gls2
I PomBase mRNA 1799915 1803141 . - . ID=SPAC1002.03c.1;Parent=SPAC1002.03c
I PomBase five_prime_UTR 1802984 1803141 . - . ID=SPAC1002.03c.1:five_prime_UTR:1;Parent=SPAC1002.03c.1
I PomBase CDS 1800212 1802983 . - 0 ID=SPAC1002.03c.1:exon:1;Parent=SPAC1002.03c.1
I PomBase three_prime_UTR 1799915 1800211 . - . ID=SPAC1002.03c.1:three_prime_UTR:1;Parent=SPAC1002.03c.1
I PomBase gene 1803624 1804491 . - . ID=SPAC1002.04c;Name=taf11
I PomBase mRNA 1803624 1804491 . - . ID=SPAC1002.04c.1;Parent=SPAC1002.04c
I PomBase five_prime_UTR 1804373 1804491 . - . ID=SPAC1002.04c.1:five_prime_UTR:1;Parent=SPAC1002.04c.1
I PomBase CDS 1803773 1804372 . - 0 ID=SPAC1002.04c.1:exon:1;Parent=SPAC1002.04c.1
I PomBase three_prime_UTR 1803624 1803772 . - . ID=SPAC1002.04c.1:three_prime_UTR:1;Parent=SPAC1002.04c.1
As you can see, no exons are annotated. I think I could reconstruct them "manually" with a Python script that combines the five_prime_UTR, three_prime_UTR and CDS features of each transcript. However, it is easy to run into a lot of exceptions with different types of transcripts. My question is: is there a tool that can easily add the "exon" feature lines by merging the UTRs and CDS?
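For reference, the manual approach I have in mind would look roughly like the sketch below. It is only a sketch under several assumptions: the file is tab-separated GFF3 with nine columns, every UTR/CDS line has exactly one Parent, and an exon is simply the union of UTR and CDS pieces that overlap or touch within a transcript. The function names (`merge_intervals`, `infer_exons`) and the `inferred_exon` ID scheme are my own invention, chosen so the new IDs do not clash with the existing `...:exon:N` IDs on the CDS lines.

```python
from collections import defaultdict

# Feature types assumed to be exonic sequence (assumption: no other types matter).
EXONIC_TYPES = {"five_prime_UTR", "CDS", "three_prime_UTR"}


def merge_intervals(intervals):
    """Merge intervals that overlap or are directly adjacent (end + 1 == start)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def infer_exons(gff_lines):
    """Return new GFF3 'exon' lines built from the UTR and CDS features."""
    pieces = defaultdict(list)  # transcript ID -> [(start, end), ...]
    info = {}                   # transcript ID -> (seqid, source, strand)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] not in EXONIC_TYPES:
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        parent = attrs.get("Parent")
        if parent is None:
            continue  # assumption: features without a Parent are skipped
        pieces[parent].append((int(cols[3]), int(cols[4])))
        info[parent] = (cols[0], cols[1], cols[6])

    exon_lines = []
    for parent, intervals in pieces.items():
        seqid, source, strand = info[parent]
        exons = merge_intervals(intervals)
        if strand == "-":
            exons.reverse()  # number exons in 5'->3' order on the minus strand
        for n, (start, end) in enumerate(exons, 1):
            attributes = f"ID={parent}:inferred_exon:{n};Parent={parent}"
            exon_lines.append("\t".join(
                [seqid, source, "exon", str(start), str(end), ".", strand, ".", attributes]
            ))
    return exon_lines
```

With a file handle, this would be called as `infer_exons(open("annotation.gff3"))` (the file name is hypothetical), and the returned lines would then be appended to the original file and re-sorted. It does not handle features with several comma-separated parents or transcripts split across sequences, which is exactly the kind of exception I am worried about.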
Thank you very much