How to parallelize a single function in R?
9.5 years ago
na.cna30 • 0

My question has two parts:

1) I want to run my function in parallel regardless of the code inside. Is that possible? E.g.:

data("iris")
x.train <- iris[1:100,1:4]
y.train <- iris[1:100,5]
x.test <- iris[101:150,1:4]
y.test <- iris[101:150,5]

myfun <- function(x.train, y.train, x.test, y.test) {
  library("e1071")
  # C-classification SVM; predictions are written to the global workspace
  model1 <- svm(x.train, y.train, type = "C-classification")
  predc <<- predict(model1, x.test)
  # nu-classification SVM
  model2 <- svm(x.train, y.train, type = "nu-classification")
  prednu <<- predict(model2, x.test)
}

I want to parallelize this part:

myfun(x.train,y.train,x.test,y.test)

2) I also want to run the above function multiple times:

for i=1:10
  myfun(x.train,y.train,x.test,y.test)

Can you tell me how I can do these two parts in parallel in R?

PS: My original data is an immense set of genome reads and I run over 10 classifiers, so I really need to do it in parallel.

R • 6.4k views

This question is better suited to StackOverflow since it is about R programming. Also, the code you have there is nowhere near enough to provide you with an answer. For instance, the for loop syntax is not even R, and the function returns nothing, as far as I can tell. An internet search returns plenty of tutorials that should help you get started with parallelizing R functions: 1, 2, 3, 4. Good luck.


Thanks for the information. The function passes its results to the workspace with the <<- operator.


And you should post first few lines of your original data. No matter how large the data, you can always do a head on it and paste a few lines. What is the role of i other than running the same code 10 times over?


Cool, I did not know about "<<-". Learned something today.


Because it's bad practice and unsafe to use the 'global assignment' operator. Parallel (or even looped) calls to this function will overwrite each other's result. The function needs to be rewritten with a normal return value.
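As a rough sketch of what "a normal return value" could look like for the function in the question: both predictions are collected in a list and returned, so repeated or parallel calls no longer overwrite each other. This assumes the e1071 package is installed; the unused y.test argument is dropped, and the list names predc and prednu are simply carried over from the question.

```r
# Sketch: same two SVM fits as in the question, but the predictions are
# returned in a list instead of being assigned globally with <<-.
# Assumes the e1071 package is installed.
myfun <- function(x.train, y.train, x.test) {
  model1 <- e1071::svm(x.train, y.train, type = "C-classification")
  model2 <- e1071::svm(x.train, y.train, type = "nu-classification")
  list(predc  = predict(model1, x.test),
       prednu = predict(model2, x.test))
}
```

It would then be called as res <- myfun(x.train, y.train, x.test), with the predictions available as res$predc and res$prednu.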


Yes, I read the linked entry from Advanced R, and it looks like something one should not use unless really needed. Well, the first time I saw it was in an Advanced R book, and that tells it all :)


haha, perhaps learned another way of R allowing one to shoot themselves in the foot.

9.5 years ago
Michael 55k

Do you know the R package parallel?

A parallel version of apply and friends is a good example for the class of problems that can be easily parallelized ("embarrassingly parallel"). The problem can be broken down into fully independent steps, like aligning N fastq sequences or applying a function to rows of a matrix. All functions compatible with apply can be used like this. Other problems are more difficult, if they need to synchronize at one point (e.g. k-means).
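To illustrate the "apply and friends" point, here is a minimal sketch that applies a function to the rows of a matrix, once with apply and once with parApply from package parallel. The matrix m and the function row_fun are made-up placeholders, and the worker count of 2 is an arbitrary assumption.

```r
library(parallel)

m <- matrix(1:20, nrow = 5)          # toy input: 5 independent rows
row_fun <- function(r) sum(r^2)      # hypothetical per-row computation

cl <- makeCluster(2)                 # 2 local workers; adjust to your machine
serial_res   <- apply(m, 1, row_fun)          # ordinary, sequential version
parallel_res <- parApply(cl, m, 1, row_fun)   # same call, spread over workers
stopCluster(cl)                      # always release the workers

all.equal(serial_res, parallel_res)  # both versions should agree
```

Because each row is processed independently, the only change from the sequential code is the cluster argument. For a computation this small the overhead of starting workers will likely outweigh any gain.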


I've had good experiences with the parLapply function of the parallel package. For a single desktop with 8 cores, it's easy to take apart a MC simulation, feed the dataset toward 8 worker nodes, and let them all have at it. I don't have to specify workload distribution, because parLapply gives each iteration to another node for me.
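A small sketch of that pattern, assuming a made-up Monte Carlo replicate (one_run, which estimates pi) in place of a real simulation. The same structure would cover "run my function 10 times" from the question, provided the function returns its result. The worker count and the seed are arbitrary choices.

```r
library(parallel)

# Hypothetical single Monte Carlo replicate: estimate pi from n random points.
# The index i is only there because parLapply passes each list element in.
one_run <- function(i, n = 10000) {
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)
}

cl <- makeCluster(2)                         # 2 local workers (assumption)
clusterSetRNGStream(cl, iseed = 42)          # independent, reproducible RNG streams
estimates <- parLapply(cl, 1:10, one_run)    # 10 replicates, split across workers
stopCluster(cl)

mean(unlist(estimates))                      # combine the replicates
```

parLapply splits the 10 iterations among the workers on its own, so no manual workload distribution is needed.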


So I need to break down my function into 2 parts, each including one SVM operation, right?


No, I don't think so. The two svm steps need to be synchronized, because you can use the svm to predict only after it has finished training, or did you mean the two different svm's trained in one run? That you could do.

Also, you need to convert your input data into a nested list, because the functions in package parallel work on lists only. There are parallel apply functions in package snow, but I think this package needs a cluster of some sort.
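One possible way to train the two different SVMs side by side, sketched under a few assumptions: e1071 is installed and visible to the workers, and the helper names fit_one and fit_svms_parallel are invented for this example. The list that parLapply iterates over holds the SVM types, so each worker trains one model and then predicts with it.

```r
library(parallel)

# Train one SVM of the given type and predict on the test set.
# Training and prediction stay together because prediction must wait for the fit.
fit_one <- function(type, x.train, y.train, x.test) {
  model <- e1071::svm(x.train, y.train, type = type)
  predict(model, x.test)
}

# Fit one SVM per entry of `types`, each on its own worker, and return the
# predictions as a named list. Assumes e1071 is installed for the workers.
fit_svms_parallel <- function(x.train, y.train, x.test,
                              types = c("C-classification", "nu-classification"),
                              n_workers = 2) {
  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl))             # release the workers even on error
  res <- parLapply(cl, as.list(types), fit_one,
                   x.train = x.train, y.train = y.train, x.test = x.test)
  names(res) <- types
  res
}
```

A call such as fit_svms_parallel(x.train, y.train, x.test) would then return one prediction vector per SVM type. With more than two classifiers, extending types (or iterating over a list of fitting functions instead) follows the same pattern.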
