Hello,
I'm having a hard time understanding the logic behind the GenBank file format, in particular the "complement" feature, which doesn't seem to make sense to me. Can someone help me out?
When I read through a full GenBank genome file, there are some annotations like: complement(1..200).
So let's say my genome is 1000 bp long. What I expect is that the sequence is pulled from positions 1 to 200 of the reverse complement of the sequence at the end of the file (which I assume is the "plus" strand), as in the example below:
<--real--->
1 ----------------------------------------------------- 1000
1000----------------------------------------------------- 1
<possible**> <-expected--->
**The other possible logic is that the range points into the reverse complement (which would be 800..1000 of the reverse strand in my logic), but that's not the case either.
But it seems that, whether or not the feature is marked as complement, the sequence is always pulled directly from the plus strand. It isn't even pulled and then reverse complemented; it's just taken straight from the plus strand, even though the annotation points to the reverse strand.
I must be missing something... can someone clear this up for me?
Thanks in advance
Hi Zhaorong, thanks a lot! It helped me!
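For anyone finding this thread later, here is a minimal sketch of the convention as the feature table format defines it: the coordinates in complement(1..200) are always counted on the sequence printed in the file (the plus strand), 1-based and inclusive, and the feature's own sequence is the reverse complement of that span. The function names and the toy 20 bp genome below are made up for illustration; this is plain Python, not any particular library's API, and a viewer that shows the plus-strand bases for a complement feature may simply be displaying the span rather than the feature sequence.

```python
# Hypothetical helpers for illustration only (no external libraries).
COMPLEMENT_TABLE = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Complement every base, then reverse so the result reads 5'->3'."""
    return seq.translate(COMPLEMENT_TABLE)[::-1]

def extract_feature(genome, start, end, on_complement=False):
    """Return the sequence of a simple feature location.

    start/end are GenBank-style coordinates: 1-based, inclusive, and always
    counted on the sequence as printed in the file, even for complement(...).
    """
    span = genome[start - 1:end]  # slice the plus strand first
    return reverse_complement(span) if on_complement else span

# Toy genome (assumed data, 20 bp) standing in for the ORIGIN sequence.
genome = "ATGCCGTAAGGCTTAACGGA"

plus_feature = extract_feature(genome, 1, 6)                       # like 1..6
minus_feature = extract_feature(genome, 1, 6, on_complement=True)  # like complement(1..6)

print(plus_feature)
print(minus_feature)
```

With this convention, complement(1..6) on the toy genome covers the same six positions as 1..6, but is read off the opposite strand. Equivalently, it is the last six bases of the reverse complement of the whole genome, which is why counting 1..6 from the start of the reverse strand gives a different answer.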