Hi everyone, I am working with a data set of 800 samples labeled "healthy" or "cancerous" (140 cancerous and 660 healthy), and each sample has 10,000 features. I am trying to find the most important features for separating healthy from cancerous samples. First I fit a binary SVM classifier like this:
library(e1071)
# ind_response is the column index of the class label in training_set
model <- svm(x = training_set[, -ind_response], y = training_set[, ind_response], probability = TRUE, scale = FALSE)
After this step I would like to get the 500 most important features according to this model. I tried:
library(rminer)
# Fit an SVM through rminer, then compute feature importance on the training data
M <- fit(response ~ ., data = training_set, model = "svm", C = 1)
svm.imp <- Importance(M, data = training_set)
The last line takes a long time to execute. Since I am planning to run the SVM inside repeated k-fold cross-validation, this is not practical for me. Is there a problem with my code, or is there a faster way to do this?
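One cheaper idea I am considering, in case it helps frame the question: if a linear kernel is acceptable, the features could be ranked by the absolute value of the SVM weight vector instead of running Importance(). This is a different technique from what rminer does, and it only makes sense for kernel = "linear" with features on comparable scales. The sketch below uses hypothetical toy data (x, y, and k are made up for illustration and stand in for my real training_set and the top 500):

```r
library(e1071)

set.seed(1)
# Hypothetical toy data: 100 samples, 20 features, only f1 and f2 carry signal
n <- 100
p <- 20
x <- matrix(rnorm(n * p), nrow = n,
            dimnames = list(NULL, paste0("f", seq_len(p))))
y <- factor(ifelse(x[, 1] + x[, 2] > 0, "cancerous", "healthy"))

# A linear kernel gives the decision function an explicit weight vector
fit_lin <- svm(x = x, y = y, kernel = "linear", scale = FALSE)

# For a two-class model, coefs holds alpha_i * y_i for each support vector,
# so the weight vector is w = t(coefs) %*% SV (one weight per feature)
w <- drop(t(fit_lin$coefs) %*% fit_lin$SV)

# Rank features by |w| and keep the top k (k would be 500 on the real data)
k <- 5
top_features <- names(sort(abs(w), decreasing = TRUE))[seq_len(k)]
```

I am not sure whether this ranking is as trustworthy as the rminer one, but it needs only a single model fit per cross-validation fold.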
PS: The randomForest function has an option to report feature importance. Is there something similar for SVM?
Thank you.