In Biopython, feature.location.end can be equal to next_feature.location.start. For example:
type: TRANSMEM
location: [187:208]
qualifiers:
Key: description, Value: Helical. {ECO:0000255}.
type: TOPO_DOM
location: [208:411]
qualifiers:
Key: description, Value: Extracellular. {ECO:0000255}.
Although this particular example (residue 208) is biologically ambiguous, others are not. So which domain does the residue at the shared boundary actually belong to?
Very informative, thanks. If I have understood you correctly, this only needs to be taken into account when converting the location integers into human-readable positions, and is not a concern for the amino acid sequence itself? For example, would an amino acid be incorrectly printed twice in print(TRANSMEM_domain.extract(record.seq), TOPO_DOM_domain.extract(record.seq))? The cookbook isn't very clear on this.

Yes, in this example you'd need to be careful about "position 208" (Python zero-based counting) versus "position 209" (more human-friendly one-based counting), which is the first amino acid in the TOPO_DOM feature. The .extract(...) method knows about the slicing so would do the right thing.
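
To make the counting concrete, here is a minimal plain-Python sketch of the same half-open, zero-based convention. It does not import Biopython. The toy sequence and the variable names are made up for illustration, and only the numbers 187, 208 and 411 come from the features in the question. Plain string slicing stands in for what .extract(...) does on a simple location like these.

```python
# Toy sequence standing in for record.seq (hypothetical; 411 residues long).
seq = "".join(chr(ord("A") + i % 26) for i in range(411))

# Half-open, zero-based bounds copied from the two features in the question.
transmem_start, transmem_end = 187, 208
topo_start, topo_end = 208, 411

transmem = seq[transmem_start:transmem_end]  # indices 187..207
topo_dom = seq[topo_start:topo_end]          # indices 208..410

# No residue is shared: the two pieces join back into one contiguous stretch.
assert transmem + topo_dom == seq[transmem_start:topo_end]
assert len(transmem) == 21 and len(topo_dom) == 203

# One-based, inclusive positions as a human would write them.
print(f"TRANSMEM {transmem_start + 1}..{transmem_end}")  # TRANSMEM 188..208
print(f"TOPO_DOM {topo_start + 1}..{topo_end}")          # TOPO_DOM 209..411
```

So the residue at zero-based index 208 (one-based position 209) belongs only to TOPO_DOM, and nothing is printed twice.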