In R, you can use the AnnotationHub package to access the many datasets hosted on AnnotationHub. It is useful because you can explore the database from within R, and because the datasets are provided as R data classes that are easy to load (see the package vignette). For example, as of today it lists over 15000 Ensembl datasets:
library(AnnotationHub)
ah <- AnnotationHub()
The metadata can be explored directly as a table (mcols returns a DataFrame, which behaves much like a data.frame):
metadata <- mcols(ah)
Or queried with the query function, which works like a kind of grep:
ens <- query(ah,"Ensembl")
ens
AnnotationHub with 15126 records
# snapshotDate(): 2019-10-29
# $dataprovider: Ensembl, UCSC, BioMart, ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Homo sapiens, Mus musculus, Rattus norvegicus, Danio rerio, Pan troglodytes, Bos taurus, Gallus gallus, Tetraodon nigroviridis, Sus scrofa, Felis ...
# $rdataclass: TwoBitFile, GRanges, EnsDb, data.frame, OrgDb, list
# additional mcols(): taxonomyid, genome, description, coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags, rdatapath,
# sourceurl, sourcetype
# retrieve records with, e.g., 'object[["AH5046"]]'
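To go from a query result to an actual object, something along these lines should work. This is only a sketch: the extra search terms ("Homo sapiens", "EnsDb") and the choice of the last record are my own illustration, the record IDs you get depend on the snapshot date, and loading an EnsDb assumes the ensembldb package is installed.

```r
library(AnnotationHub)

ah <- AnnotationHub()

# Several patterns can be passed at once; by default all of them must match
# (illustrative search terms, not taken from the output above)
ens_human <- query(ah, c("Ensembl", "Homo sapiens", "EnsDb"))

# Look at a few metadata columns before downloading anything
head(mcols(ens_human)[, c("title", "genome", "rdataclass")])

# names() gives the AnnotationHub IDs (e.g. "AH....");
# here the last record is picked purely as an example
id <- names(ens_human)[length(ens_human)]

# Double brackets download the resource (cached locally) and load it
# as its R data class; an EnsDb needs the ensembldb package
edb <- ens_human[[id]]
edb
```

The download only happens on the first call; later calls with the same ID read from the local cache.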
(Also, for easy annotation of your peaks with custom annotations, I find the annotatr package to be intuitive.)
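For the annotatr part, a minimal sketch of the basic workflow is below. The peak coordinates are made up, and I use the built-in "hg19_basicgenes" annotation set rather than a custom one just to keep the example short; it assumes the hg19 TxDb and org.Hs.eg.db packages that annotatr relies on are installed.

```r
library(annotatr)
library(GenomicRanges)

# Hypothetical peaks, for illustration only
peaks <- GRanges(
  seqnames = "chr9",
  ranges = IRanges(start = c(21970000, 21990000, 22005000), width = 500)
)

# Build a GRanges of gene-centric annotations (promoters, exons, introns, ...)
annots <- build_annotations(genome = "hg19", annotations = "hg19_basicgenes")

# Intersect the peaks with the annotations; one row per peak/annotation overlap
annotated <- annotate_regions(
  regions = peaks,
  annotations = annots,
  ignore.strand = TRUE,
  quiet = TRUE
)

head(as.data.frame(annotated))
```

A peak that overlaps several annotation types appears once per overlap in the result, so expect more rows than input peaks.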