Dear all,
I am trying to convert a GenBank file into Generic Feature Format version 3 (GFF3). According to the online GFF3 validator, the file I have prepared is OK, but there are issues with the phases of some of the CDS features. I have summarized the issues in the tables below:
entry  phase  start  end    delta  delta third
1      0      76014  76437  423    141.00
2      2      76532  76794  262    87.33
3      0      76901  77296  395    131.67
4      0      82169  82217  48     16.00
5      2      82326  83644  1318   439.33
6      0      83960  85309  1349   449.67
7      1      86174  88442  2268   756.00
8      0      88544  89010  466    155.33
9      1      89700  90945  1245   415.00
10     0      91042  91496  454    151.33
entry  phase  Minus 1  Minus 1 third  Minus 2  Minus 2 third
1      0      422      140.67         421      140.33
2      2      261      87.00          260      86.67
3      0      394      131.33         393      131.00
4      0      47       15.67          46       15.33
5      2      1317     439.00         1316     438.67
6      0      1348     449.33         1347     449.00
7      1      2267     755.67         2266     755.33
8      0      465      155.00         464      154.67
9      1      1244     414.67         1243     414.33
10     0      453      151.00         452      150.67
entry: a given CDS feature
phase: the phase suggested by the validator
start: the start position of the CDS
end: the end position of the CDS
delta: the difference end - start
delta third: delta / 3
Minus 1: delta - 1
Minus 1 third: (delta - 1) / 3
Minus 2: delta - 2
Minus 2 third: (delta - 2) / 3
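For reference, the derived columns can be recomputed with a few lines of Python. This is only a sketch: the function name `derived_columns` is made up for illustration, and the extra `length` value is not in the tables. It is included because GFF3 coordinates are 1-based and inclusive, so the number of bases in a feature is end - start + 1 rather than end - start.

```python
# (entry, start, end) copied from the first table.
ROWS = [
    (1, 76014, 76437),
    (2, 76532, 76794),
    (3, 76901, 77296),
    (4, 82169, 82217),
    (5, 82326, 83644),
]


def derived_columns(start, end):
    """Recompute the table columns for one CDS row (hypothetical helper)."""
    delta = end - start
    length = end - start + 1  # 1-based inclusive coordinates: one more than delta
    return {
        "delta": delta,
        "delta_third": round(delta / 3, 2),
        "minus_1": delta - 1,
        "minus_2": delta - 2,
        "length": length,
        "length_mod_3": length % 3,
    }


for entry, start, end in ROWS:
    print(entry, derived_columns(start, end))
```

For entry 1 this gives delta = 423 but length = 424, so the inclusive length is not a multiple of 3 even though delta is.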
Taking entry #1, its length delta is divisible by 3, so it makes sense that the validator accepted a phase of 0. The same goes for features 4, 6, 8 and 10.
The second entry is not directly divisible by 3, so it is understandable that the validator flagged it. After shortening the feature by 1 nucleotide (Minus 1), its length is divisible by 3, so I expected the phase to be 1, not 2. The same goes for feature 5.
The third feature is also not divisible by 3, yet the validator did not flag it. It becomes divisible by 3 after removing two nucleotides, so I thought the phase should have been 2.
Features 7 and 9 are divisible by 3, yet they are flagged with a phase of 1.
It is pretty confusing for me and I haven't found many tutorials online on the subject.
How do I calculate the phase of the CDS based on the start-end positions?
Thanks
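For what it's worth, here is a minimal sketch of how the phase is usually derived. The GFF3 specification defines the phase of a CDS feature as the number of bases to remove from the start of that feature to reach the first base of the next codon, so it depends on the segments that come before it in the same CDS, not on the segment's own length alone. The sketch makes several assumptions that the post does not confirm: rows 1 to 6 are treated as consecutive segments of one plus-strand CDS, the first segment is assumed to start at a complete codon (phase 0), and the names `next_phase` and `assign_phases` are made up for illustration.

```python
def next_phase(length, phase):
    """Phase of the following CDS segment, given this segment's length and phase."""
    # Bases left over after skipping `phase` bases and reading whole codons
    # must be completed by the start of the next segment.
    return (3 - (length - phase) % 3) % 3


def assign_phases(segments, strand="+", first_phase=0):
    """Return (start, end, phase) for each segment, in transcript order.

    segments: (start, end) pairs with 1-based inclusive coordinates.
    On the minus strand the segment with the highest coordinates comes first.
    """
    ordered = sorted(segments, reverse=(strand == "-"))
    result = []
    phase = first_phase
    for start, end in ordered:
        result.append((start, end, phase))
        length = end - start + 1  # inclusive coordinates
        phase = next_phase(length, phase)
    return result


# Rows 1-6 of the table, ASSUMED to belong to one plus-strand CDS.
SEGMENTS = [
    (76014, 76437),
    (76532, 76794),
    (76901, 77296),
    (82169, 82217),
    (82326, 83644),
    (83960, 85309),
]

for start, end, phase in assign_phases(SEGMENTS):
    print(start, end, phase)
```

Under those assumptions the computed phases are 0, 2, 0, 0, 2, 0, which matches what the validator suggested for entries 1 to 6. Entry 7 does not follow from entry 6 in this way, which may simply mean it belongs to a different gene or to a CDS that does not start at phase 0; the table alone cannot tell.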
If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.
I voted the comment to the answer; anyway, I now upvoted and checked the answer itself. Thanks.
Cheers - votes on answers are supposed to indicate the relevance of these answers to future users; it's a bit more prominent and has a reason (not only human narcissism) ;-)