10.1 years ago
Endre Bakken Stovner
I am looking at the schema for GENCODE GTF files. It states that the starts are 1-indexed but does not mention which indexing the ends use. I do not want to blithely assume that the ends are also 1-indexed: the UCSC database dumps use 0-indexing for the start and 1-indexing for the end, so GENCODE might use the reverse convention.
Does anyone know for sure? Sources or how you came to know/computed the answer would be nice.
I would just note that it is not only the indexing that matters, but also whether the interval is open or closed (inclusive) at each coordinate. These are different concepts. Even though UCSC may describe its format as mixing 0-indexed starts with 1-indexed ends, I think that just makes things more confusing. UCSC coordinates are zero-based and open at the upper limit, so the listed end position is not itself part of the interval.
You are correct that GTF is indexed from 1 and that the interval includes both listed coordinates, start and end.
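To make the two conventions concrete, here is a minimal Python sketch of converting between them, assuming the UCSC-style interval is 0-based and half-open and the GTF interval is 1-based and closed, as described above. The function names are made up for illustration and do not come from any library.

```python
def ucsc_to_gtf(start, end):
    """Convert a 0-based, half-open interval to a 1-based, closed one.

    Only the start shifts: the exclusive 0-based end and the
    inclusive 1-based end are the same number.
    """
    return start + 1, end


def gtf_to_ucsc(start, end):
    """Convert a 1-based, closed interval to a 0-based, half-open one."""
    return start - 1, end


def ucsc_length(start, end):
    """Number of bases covered by a 0-based, half-open interval."""
    return end - start


def gtf_length(start, end):
    """Number of bases covered by a 1-based, closed interval."""
    return end - start + 1


# The first 100 bases of a sequence in each convention.
ucsc_interval = (0, 100)
gtf_interval = ucsc_to_gtf(*ucsc_interval)
print(gtf_interval)
print(ucsc_length(*ucsc_interval), gtf_length(*gtf_interval))
```

A quick sanity check on any file is that the feature length should come out the same under both conventions; if it is off by one, the wrong convention has been assumed somewhere.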