I know of at least two coordinate systems in use for describing genomic positions. Does anyone know which system is used by Biomart (specifically, the biomaRt R package) when it returns genomic coordinates?
BioMart, whether web-based or accessed via biomaRt, is essentially an interface to Ensembl. Ensembl uses "1-based" coordinates. From their tutorial (scroll down to "Coordinates"):
Ensembl, and many other bioinformatics applications, use inclusive coordinates which start at 1. The first nucleotide of a DNA sequence is 1 and the first amino acid of a protein sequence is also 1. The length of a sequence is defined as end - start + 1.
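To make the quoted convention concrete, here is a minimal Python sketch of 1-based, inclusive arithmetic. The function names are my own for illustration and are not part of biomaRt or Ensembl; the conversion assumes the other system you have in mind is the 0-based, half-open one used by formats such as BED.

```python
def interval_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval: end - start + 1."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def to_zero_based_half_open(start: int, end: int) -> tuple[int, int]:
    """Convert 1-based inclusive coordinates to 0-based half-open ones.

    Only the start shifts by one; the end keeps the same number because
    it changes from inclusive to exclusive.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


def slice_sequence(seq: str, start: int, end: int) -> str:
    """Extract a subsequence given 1-based inclusive coordinates."""
    zero_start, zero_end = to_zero_based_half_open(start, end)
    # Python slices are already 0-based and half-open.
    return seq[zero_start:zero_end]


if __name__ == "__main__":
    print(interval_length(100, 200))          # 101 positions
    print(to_zero_based_half_open(100, 200))  # (99, 200)
    print(slice_sequence("ACGTACGT", 2, 4))   # CGT
```

So if you feed coordinates returned by biomaRt into a tool that expects 0-based, half-open intervals, subtract 1 from the start and leave the end as it is.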
Since BioMart is a tool that provides a web front end to any relational database, I would rather say "provides an interface" than "is" ;-) For other marts, the coordinate system might well depend on the data source the mart is built upon (in this case I agree it's Ensembl, of course).
Well yes, BioMart does more than access Ensembl; a little lazy of me, but fine in the context of this question. The point being that the coordinate system of interest is a property of Ensembl, not BioMart.