6.5 years ago
Huichen03
I am trying to use GRanges to map SNPs to genes within about 100 kb. But I do not know how to download a Homo sapiens gene location file that includes the Ensembl gene ID, chromosome, start and end.
Thank you in advance!
Maybe it's worth pointing out that GTF is identical to GFF2 and GFF2 is deprecated. So, other things being equal, go for GFF3, often referred to as just GFF. Did I get it right?
Using BioMart is still the easy way since there would be further manipulations needed to get the information OP wants from these files.
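To give an idea of the "further manipulations" involved, here is a minimal Python sketch that pulls gene ID, chromosome, start and end out of a GFF3 file. It assumes the usual nine tab-separated GFF3 columns and that gene lines carry a `gene_id=...` entry in the attributes column, as Ensembl GFF3 files generally do. The function names are made up for illustration, and `read_gene_table` expects a gzipped file that you have already downloaded.

```python
import gzip


def parse_attributes(field):
    """Turn a GFF3 attributes string ('key=value;key=value') into a dict."""
    pairs = (item.split("=", 1) for item in field.strip().split(";") if "=" in item)
    return {key: value for key, value in pairs}


def gene_records(lines, feature_types=("gene",)):
    """Yield (gene_id, chrom, start, end) for each matching feature line.

    Assumption: the attributes column contains a 'gene_id' key.
    Lines without it are skipped rather than guessed at.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header/comment and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] not in feature_types:
            continue  # not a well-formed line of the requested feature type
        gene_id = parse_attributes(cols[8]).get("gene_id")
        if gene_id is None:
            continue
        # GFF3 coordinates are 1-based and inclusive
        yield gene_id, cols[0], int(cols[3]), int(cols[4])


def read_gene_table(path):
    """Read a gzipped GFF3 file into a list of gene tuples."""
    with gzip.open(path, "rt") as handle:
        return list(gene_records(handle))
```

By default only features of type `gene` are kept. If your file labels some genes with other feature types, pass them via `feature_types` after checking the third column of the file.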
Hi genomax- I agree. However, I think the advantage of going for a direct link is that you can stick in your documentation or script something like
wget ftp://ftp.ensembl.org/pub/release-92/gff3/homo_sapiens/Homo_sapiens.GRCh38.92.chr.gff3.gz
and that makes it self-explanatory what data you are referring to. In my experience (maybe it's just me of course...!) with layers like BioMart either you have to be quite diligent in documenting what you have done or you end up with data that you can't be sure how you got.
You can use a programmatic interface to BioMart like biomaRt and then your script records the steps you took to get the data out.
dariober : Agreed. Depending on how savvy OP is they can choose an appropriate method to follow.
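Coming back to the original question (SNPs within roughly 100 kb of a gene): once you have a gene table, however you obtained it, the window logic itself is small. Below is a plain-Python sketch, not GenomicRanges. It assumes genes are `(gene_id, chrom, start, end)` tuples with 1-based inclusive coordinates. The function name and the example data are hypothetical.

```python
def genes_near_snp(snp_chrom, snp_pos, genes, flank=100_000):
    """Return IDs of genes whose span, extended by `flank` bp on each side,
    contains the SNP position.

    `genes` is an iterable of (gene_id, chrom, start, end) tuples.
    This is a linear scan, fine for a sketch but slow for many SNPs.
    """
    hits = []
    for gene_id, chrom, start, end in genes:
        if chrom == snp_chrom and start - flank <= snp_pos <= end + flank:
            hits.append(gene_id)
    return hits


# Hypothetical example data, for illustration only
example_genes = [
    ("geneA", "1", 200_000, 300_000),
    ("geneB", "1", 500_000, 600_000),
    ("geneC", "2", 200_000, 300_000),
]
nearby = genes_near_snp("1", 400_000, example_genes)
```

For genome-wide SNP sets, an interval-based approach (for example overlap queries on GRanges in R, as the question intends) will scale much better than this loop.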