In Ensembl, when I use this link to retrieve the sequence on the sense strand of the X chromosome between positions 200000 and 200200, I get the following output:
>X dna:chromosome chromosome:GRCh37:X:200000:200200:1
CCAAACCCCAGGCAGGAGACCAGCCCGTGTTATACGGTGCCTGGAGGAGGCGTGACTCAT
TTGCATAGCGCTGAGGGGATTGGTCTGACCAGGCCTGTCATTCACGTAGCCCGCGAAAAA
CCTGGCCCGCCCACCCCAGTTCCGTAATATGCAAATGTAGGGCGCCATGATGTTCCACAC
GCCTGAGGGTAGTGGGGGCGG
This contains 201 nucleotides, but from my query I was expecting 200. Where has this extra nucleotide come from? Which position is it at? Is my query wrong?
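This is expected, and your query is fine. If you specify your range as 200000:200001 you get two nucleotides: (1) C at position 200000 and (2) C at position 200001. So length = end - start + 1, and for your query 200200 - 200000 + 1 = 201. There is no extra nucleotide at any particular position; both endpoints are simply counted.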
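If it helps, here is a minimal Python sketch of that arithmetic, checked against the sequence pasted in the question. The function name `closed_interval_length` is just something made up for illustration; it is not part of any Ensembl API.

```python
def closed_interval_length(start: int, end: int) -> int:
    """Number of positions in an inclusive interval [start, end]."""
    if end < start:
        raise ValueError("end must be >= start")
    return end - start + 1


# Sequence lines copied verbatim from the FASTA output in the question.
fasta_lines = [
    "CCAAACCCCAGGCAGGAGACCAGCCCGTGTTATACGGTGCCTGGAGGAGGCGTGACTCAT",
    "TTGCATAGCGCTGAGGGGATTGGTCTGACCAGGCCTGTCATTCACGTAGCCCGCGAAAAA",
    "CCTGGCCCGCCCACCCCAGTTCCGTAATATGCAAATGTAGGGCGCCATGATGTTCCACAC",
    "GCCTGAGGGTAGTGGGGGCGG",
]
sequence = "".join(fasta_lines)

print(closed_interval_length(200000, 200001))  # 2
print(closed_interval_length(200000, 200200))  # 201
print(len(sequence))                           # 201
```

So to get exactly 200 nucleotides starting at 200000, you would ask for 200000 to 200199.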
To complete the answer: this is because Ensembl uses closed intervals, so both the start and the end coordinate are included (i.e. your end coordinate is the last position of the interval, not the position after it).
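This matters most when you take such coordinates into a language whose slices exclude the end index, as Python's do. A rough sketch of the conversion, assuming you already hold the retrieved region as a string and know the genomic position of its first base; `closed_to_slice` and `region_start` are hypothetical names used only for this example:

```python
def closed_to_slice(start: int, end: int, region_start: int) -> slice:
    """Map an inclusive genomic interval [start, end] onto a Python slice.

    region_start is the genomic position of the first character of the
    string being sliced. Python slices exclude their stop index, so the
    stop is one past the offset of `end`.
    """
    return slice(start - region_start, end - region_start + 1)


region_start = 200000
# First 20 bases of the sequence shown in the question.
sequence = "CCAAACCCCAGGCAGGAGAC"

first_two = sequence[closed_to_slice(200000, 200001, region_start)]
print(first_two)  # CC
```

The `+ 1` on the stop index is the same `+ 1` that appears in the length formula: it accounts for the end coordinate being included.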
That's a really good way of explaining it - makes complete sense now, thank you!