GenBank file format is wrong?
10.6 years ago
eddie.im ▴ 140

Hello,

I'm having a hard time understanding the logic behind the GenBank file format. In particular, the "complement" notation doesn't seem to make sense to me. Can someone help me out?

When I read through a full GenBank genome file, there are some annotations like: complement(1..200).

So let's say that my genome has 1000 bp. What I'm expecting is that I'm pulling the sequence from range 1 to 200 of the reverse complement of the sequence at the end of the file (which I assume is the "plus" strand), like in the example below:

     <--real--->
 1  ----------------------------------------------------- 1000

1000----------------------------------------------------- 1
    <possible**>                           <-expected--->

**The other possible logic is that the range points to the reverse complement (which would be 800..1000 of the reverse strand in my logic), but that's not the case either.

But it seems that, whether it's complement or not, the sequence is always pulled directly from the plus strand. It's not even pulled and then reverse complemented; it's just pulled directly from the plus strand, even though it's pointing to the reverse strand.

I must be missing something... can someone make this clear for me?

Thanks in advance

10.6 years ago
Zhaorong ★ 1.4k

I don't quite get your question. But reading this may help: http://www.insdc.org/files/feature_table.html#3.4.3
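In case a concrete example is useful: as far as I understand the location rules in that document, the coordinates in complement(1..200) always count along the sequence as printed in the file, and the feature itself is read from the opposite strand, so its sequence is the reverse complement of bases 1..200. Below is a minimal sketch of that reading in plain Python. The function name `extract_feature` and the toy genome are made up for illustration, and it only handles a simple `start..end` span (no joins or fuzzy ends).

```python
# Translation table for complementing DNA bases (upper- and lowercase).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_feature(sequence, start, end, complement=False):
    """Return the bases for a location like 1..200 or complement(1..200).

    start and end are 1-based and inclusive, as in a GenBank feature table,
    and always count along the sequence as it is printed in the file.
    (Hypothetical helper for illustration, not part of any library.)
    """
    # Convert to Python's 0-based, end-exclusive slice.
    span = sequence[start - 1:end]
    if complement:
        # Complement each base, then reverse so the result reads 5' to 3'.
        return span.translate(COMPLEMENT)[::-1]
    return span


genome = "ATGCCGTAAA"  # toy stand-in for the sequence in the file

print(extract_feature(genome, 1, 4))                   # plus-strand feature: ATGC
print(extract_feature(genome, 1, 4, complement=True))  # same coordinates, other strand: GCAT
```

So both features cover the same positions on the printed sequence; only the strand they are read from differs. If a tool shows the plus-strand bases for a complement feature, it may simply be displaying the span without applying the operator.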


Hi Zhaorong, thanks a lot! It helped me!
