I have a list of Homo sapiens genomic loci. I would like to see which genes lie within the region around each locus, say give or take 1 Mb.
I have written a script in R and am trying to use biomaRt to do this.
So far I have my list, called filterlist:
list("1:4864876:5864876", "1:14283067:15283067", "1:21786817:22786817",
"1:33465769:34465769", "1:45300539:46300539", "1:54333815:55333815",
"1:65114236:66114236", "1:75194833:76194833", "1:86037468:87037468",
"1:96462256:97462256", "1:105259436:106259436", "1:116234756:117234756",
"1:120842170:121842170", "1:145808064:146808064", "1:155459582:156459582",
"1:166112356:167112356", "1:174453227:175453227", "1:185347260:186347260",
"1:194299241:195299241", "1:205731116:206731116")
And here is my code, which admittedly is a bit of a cut-and-paste job, and which doesn't seem to give me any output at all:
library("biomaRt")
listMarts(host = "www.ensembl.org")
ensembl <- useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                   dataset = "hsapiens_gene_ensembl",
                   host = "jul2015.archive.ensembl.org")
filters <- listFilters(ensembl)
filterlist <- as.list(HMapSamp$MergeCol)
results <- getBM(attributes = c("hgnc_symbol", "entrezgene", "chromosome_name",
                                "start_position", "end_position"),
                 filters = c("chromosomal_region", "biotype"),
                 values = filterlist, mart = ensembl)
I just want to get the names of the genes lying within (or intersecting with) the regions in my list.
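One possible cause, offered as a sketch rather than a confirmed diagnosis: the getBM() call names two filters ("chromosomal_region" and "biotype") but passes a flat list of 20 region strings as values, so the values do not line up one-to-one with the filters. With several filters, getBM() expects values to be a list with one element per filter. The helper name genes_in_regions and the "protein_coding" biotype below are assumptions made for illustration; they are not part of biomaRt or of the original script. The archive host and the "entrezgene" attribute are copied from the code above and may need adjusting for current Ensembl releases.

```r
# Regions as "chromosome:start:end" strings, as in filterlist (first three shown).
filterlist <- list("1:4864876:5864876", "1:14283067:15283067", "1:21786817:22786817")

# Flatten the list into one character vector holding all regions.
regions <- unlist(filterlist)

# Hypothetical helper (not part of biomaRt): fetch genes overlapping the regions.
# biotype is optional; when given, it is supplied as the value of a second filter.
genes_in_regions <- function(regions, biotype = NULL) {
  # Host copied from the original script; assumed to still be reachable.
  ensembl <- biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                              dataset = "hsapiens_gene_ensembl",
                              host = "jul2015.archive.ensembl.org")
  if (is.null(biotype)) {
    # One filter: values can be a plain character vector.
    filters <- "chromosomal_region"
    values <- regions
  } else {
    # Two filters: values must be a list with one element per filter, in order.
    filters <- c("chromosomal_region", "biotype")
    values <- list(regions, biotype)
  }
  biomaRt::getBM(attributes = c("hgnc_symbol", "entrezgene", "chromosome_name",
                                "start_position", "end_position"),
                 filters = filters, values = values, mart = ensembl)
}

# Usage (requires the biomaRt package and network access to Ensembl):
# results <- genes_in_regions(regions)                              # all biotypes
# results <- genes_in_regions(regions, biotype = "protein_coding")  # assumed biotype
```

If no biotype restriction is actually wanted, dropping that filter and passing only the region vector is the simpler option.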
I think these libraries have changed how they output the genomic ranges objects.
The first ENSEMBL ID is 100126349, but how can I extract those IDs into a new data.frame? I still struggle to understand genomic range lists. :(
After an hour I found the function that retrieves the names, lol.