Hi all,
I am new to these bioinformatic formats.
I know that in GTF format, exon 1 of a transcript on the forward strand is the first exon listed (i.e. exon 86914652 86915160).
But for transcripts on the reverse strand, which exon is exon 1 of the transcript? Is it still the first exon, or the last one?
Thank you! So exon 1 is "exon 1091991 1092103" in the second figure?
I'm still confused about "highest coordinate" and "lowest coordinate". In the example in the second figure, which number is the highest coordinate?
Considering the orientation of the transcript on the minus strand, shouldn't we see those exons in the reverse order compared to the exons of transcripts on the plus strand?
The chromosomal coordinates are written from left to right.
For your transcript on the minus strand above, the start of the transcript is 1116111, so the first exon of the transcript is the second exon in the list, the one with coordinates 1116060 to 1116111.
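With this convention, the start of a gene on the minus strand (its 5' end, where transcription begins) lies at a higher coordinate than its end. Within each GTF line, though, the start column is still the smaller number and the end column the larger one, whatever the strand.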
thank you!!! it helps me a lot!
Your last statement is correct: from a genomic point of view, the order of exons is reversed for transcripts on the minus strand. The highest coordinate is simply the biggest number. Since 1116060 > 1091991, the exon at 1116060 to 1116111 is the one with the highest coordinates. This is easiest to see in a genome browser, so I would advise you to do just that: look at genes on the minus strand and genes on the plus strand, and compare the exon numbering with the coordinates.
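If it helps to see the rule as code, here is a minimal Python sketch. The function name `order_exons` is made up for illustration, and the input is just the two minus-strand exons quoted in this thread, not your whole transcript. It assumes each exon is a (start, end) pair taken from columns 4 and 5 of the GTF, with start <= end.

```python
def order_exons(exons, strand):
    """Return exons in transcript order, so that exon 1 comes first.

    exons  -- list of (start, end) tuples with start <= end (GTF columns 4 and 5)
    strand -- "+" or "-" (GTF column 7)
    """
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand: {strand!r}")
    # Plus strand: transcription runs left to right, so the lowest start is exon 1.
    # Minus strand: transcription runs right to left, so the highest start is exon 1.
    return sorted(exons, key=lambda exon: exon[0], reverse=(strand == "-"))


# The two minus-strand exons mentioned in this thread, in file order.
minus_exons = [(1091991, 1092103), (1116060, 1116111)]

for number, (start, end) in enumerate(order_exons(minus_exons, "-"), start=1):
    print(f"exon {number}: {start}-{end}")
# exon 1: 1116060-1116111
# exon 2: 1091991-1092103
```

Some GTF files also carry an exon_number attribute in the last column; if yours does, you can check the result against it instead of relying on the sort.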