Hello Everyone,
I am generating multiple large lists of genomic regions. To find out which genes are in these regions, I would usually submit the regions to Ensembl BioMart. However, I now generate too many lists for it to be feasible to do this manually. Does anyone know whether it is possible to automate this process with a script (ideally a simple one, as I only have basic programming skills), or whether there is a similar service that can do this automatically from a list of files or a few commands?
P.S. I would also be interested to hear whether people know of or use a similar automated approach for pathway enrichment, as this is what I will then do with the lists of genes.
Thanks very much in advance for your help!
Best regards,
Rubal
Which command do you use for this? If you tell us, a wrapper script can be written around it.
Currently I go to Ensembl BioMart in the browser, select the Ensembl Genes database, choose Homo sapiens as the species, then under Filters select "Multiple Chromosomal Regions" and upload a file containing these regions. This then outputs a file containing the IDs of the genes in those regions.
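For anyone who wants to script those same steps without the browser, here is a rough Python sketch that builds the equivalent BioMart XML query. It is only a sketch under some assumptions: the service URL, the `hsapiens_gene_ensembl` dataset name, the `chromosomal_region` filter and the two output attributes are what I believe the Ensembl mart uses, but please check them against the current Ensembl BioMart documentation. The function names `build_query` and `fetch_genes` are made up for illustration.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint of the Ensembl BioMart service; verify before relying on it.
MARTSERVICE_URL = "http://www.ensembl.org/biomart/martservice"


def build_query(regions, dataset="hsapiens_gene_ensembl"):
    """Return a BioMart XML query for genes overlapping the given regions.

    regions: list of "chromosome:start:end" strings, e.g. "1:100000:200000".
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Several regions go into one filter as a comma-separated value.
    ET.SubElement(ds, "Filter", {"name": "chromosomal_region",
                                 "value": ",".join(regions)})
    # Assumed attribute names: Ensembl gene ID and gene symbol.
    for attr in ("ensembl_gene_id", "external_gene_name"):
        ET.SubElement(ds, "Attribute", {"name": attr})
    header = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return header + ET.tostring(query, encoding="unicode")


def fetch_genes(regions):
    """Send the query (needs network access) and return rows of columns."""
    data = urllib.parse.urlencode({"query": build_query(regions)}).encode()
    with urllib.request.urlopen(MARTSERVICE_URL, data=data) as resp:
        text = resp.read().decode()
    return [line.split("\t") for line in text.splitlines() if line]
```

Calling `fetch_genes(["1:100000:200000", "2:5000000:6000000"])` should then give one row per gene, provided the assumptions above hold for the current Ensembl release.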
Try the biomaRt R package. It lets you pull BioMart data directly into R, where you can either export it or process it further, and you can write automated R scripts around it. I don't work with it myself, so maybe someone who does can help you with documentation and scripts. Tutorial
biomaRt looks perfect for this, thank you. It even has some relevant examples in the documentation. Assuming there is a way to submit a list of regions in biomaRt rather than one region at a time, this will be great (I assume there is a way to do this; I will keep looking).
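On submitting a whole list at once: as far as I know, the `chromosomal_region` filter accepts several regions in one go, each written as `chromosome:start:end` (as a vector of such strings in biomaRt, or comma-separated in a raw BioMart query). Below is a small Python sketch of preparing region files in that form. The input layout (whitespace-separated chromosome, start, end, with an optional `chr` prefix), the helper names and the batch size of 200 are all my own assumptions, not anything BioMart prescribes.

```python
from pathlib import Path


def parse_regions(lines):
    """Convert lines like 'chr1<TAB>100000<TAB>200000' into '1:100000:200000'."""
    regions = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        chrom, start, end = line.split()[:3]
        # Ensembl chromosome names have no 'chr' prefix.
        if chrom.lower().startswith("chr"):
            chrom = chrom[3:]
        regions.append(f"{chrom}:{int(start)}:{int(end)}")
    return regions


def chunks(items, size=200):
    """Yield successive batches so that no single query becomes huge."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def filter_values_for_file(path, size=200):
    """Return one comma-separated filter value per batch of regions in a file."""
    regions = parse_regions(Path(path).read_text().splitlines())
    return [",".join(batch) for batch in chunks(regions, size)]
```

To cover many lists, you could loop over the files (for example with `Path("regions").glob("*.txt")`, a hypothetical folder name) and submit each returned string as the value of the region filter.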