Parallelization in R 2020
4.6 years ago

Hello Biostars,

I would like to ask about the current state of parallelization in R.

I have searched through various articles and posts on Biostars and StackOverflow, but most of them are quite old.

Has anyone benchmarked the current parallelization packages in R for versatility, usability, and speed?

Is there a preferred package for parallelizing on a laptop? What if the same code later needs to run on a cluster?

Background

Most of the time I work in Bioconductor Docker containers, otherwise on a Windows laptop.

In my project I work with Genomic Ranges (GR), so I think a "per chromosome" approach should be more suitable, although sometimes I need to slice the GR even further for the approach to fit in the laptop's RAM.

I also use packages such as bumphunter and derfinder.

Thank you for your time,

Konstantinos

R parallel BiocParallel doParallel snowfall • 1.2k views
Not a bioinformatics question unless you provide context. Also, what's best depends on the task and the hardware: not every task benefits equally from different parallelization approaches.

Can you give some background on what you want to parallelize and which operating system you are on?

@ATpoint Most of the time I work in Bioconductor Docker containers, otherwise on a Windows laptop. In my project I work with genomic ranges, so I think a "per chromosome" approach should be more suitable. I also use packages such as bumphunter and derfinder.

@Jean-Karim Heriche Should I move it to StackOverflow?

Just edit your question to give bioinformatics context.

If RAM on a laptop is the concern, then it seems the issue is not parallelization but splitting the GRanges. How would you benefit from parallelization then?

In cases where I have more than 10,000 GRs per chromosome, I split them into chunks and run the parallelized downstream functions on these chunks instead of on whole chromosomes.
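
A minimal sketch of this kind of chunking, assuming base R's `parallel` package and a plain data frame as a stand-in for a GRanges object. The toy data, the `max_per_chunk` limit, and the `chunk_width` function are hypothetical placeholders for the real ranges and the real downstream function. A PSOCK cluster from `makeCluster()` is used here because it also works on Windows, where fork-based `mclapply()` cannot use more than one core.

```r
library(parallel)

# Toy stand-in for genomic ranges (hypothetical data, not a GRanges object).
set.seed(1)
n <- 25000
ranges <- data.frame(
  chr   = sample(c("chr1", "chr2"), n, replace = TRUE),
  start = sample.int(1e6, n, replace = TRUE)
)
ranges$end <- ranges$start + 99L

# Group row indices by chromosome, then cut each chromosome into
# chunks of at most `max_per_chunk` ranges.
max_per_chunk <- 10000
by_chr <- split(seq_len(n), ranges$chr)
chunks <- unlist(
  lapply(by_chr, function(idx) {
    split(idx, ceiling(seq_along(idx) / max_per_chunk))
  }),
  recursive = FALSE
)

# Materialize each chunk as its own small data frame.
pieces <- lapply(chunks, function(idx) ranges[idx, ])

# Hypothetical downstream function: total width of the ranges in one piece.
chunk_width <- function(piece) {
  sum(piece$end - piece$start + 1L)
}

# Two local worker processes; stop the cluster once the work is done.
cl <- makeCluster(2)
res <- parLapply(cl, pieces, chunk_width)
stopCluster(cl)

total_width <- sum(unlist(res))
```

Each worker is sent only the pieces assigned to it rather than the full table, which matters when RAM is the limit. Lowering `max_per_chunk` gives smaller pieces at the cost of more scheduling overhead.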
