I'm trying to replicate a published analysis as an exercise. From that analysis I have obtained the significant genes and the samples (patients). For the classification I use the count matrix, with the genes as features and the patients as the samples to be classified as Tumor or Non-Tumor. But I'm really confused. The book says:
For the multi-class SVM classification algorithm, a One-Versus-One
(OVO) approach was used. To cross validate the algorithm for all
samples in the training cohort, the SVM algorithm was trained by all
samples in the training cohort minus one, while the remaining sample
was used for (blind) classification. This process was repeated for all
samples until each sample was predicted once (leave-one-out
cross-validation [LOOCV] procedure).
I don't really see how I can use OVO with LOOCV. I know how to use LOOCV in Python on top of an SVM, and I know how to use OVO in Python on top of an SVM, but I don't see how to use both together. I might be missing something, which is why I'm asking here: does anyone know what they mean?
For the multi-class SVM classification algorithm, a One-Versus-One (OVO) approach was used.
SVM is a binary classifier, so it doesn't natively classify multiple classes. It has to split a multiclass problem into multiple independent binary classification tasks. There is a one-versus-rest (OVR) method which is commonly used, but I don't know what OVO is. It really shouldn't matter for solving your problem because you have a common binary classification (tumor vs. non-tumor).
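For reference, One-Versus-One trains one binary classifier per pair of classes, so k classes need k(k-1)/2 classifiers, while one-versus-rest needs k. A minimal sketch of the difference, assuming scikit-learn and made-up random data (the array sizes and the 7-class setup are illustrative only, not the real dataset):

```python
import numpy as np
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

# Hypothetical stand-in data: 7 classes, 20 samples each, 20 features.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(7), 20)
X = rng.normal(size=(len(y), 20)) + y[:, None]  # shift each class so they differ

# Wrap the same binary SVM in the two multiclass strategies.
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)
ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)

n_classes = len(np.unique(y))
n_ovo = len(ovo.estimators_)  # one classifier per pair of classes
n_ovr = len(ovr.estimators_)  # one classifier per class
print(n_classes, n_ovo, n_ovr)
```

With only two classes (tumor vs. non-tumor) both strategies reduce to a single binary classifier, which is why the choice does not matter in the binary case.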
LOOCV is typically used when the number of samples (patients, in your case) is small; we don't know how many you have. There is nothing special about it. Say you have 25 samples: LOOCV is then the same as k-fold cross-validation with k=25, where each sample is held out for validation exactly once. Most SVM implementations I know about do not natively support LOOCV, so you may have to set your folds manually.
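To make the equivalence concrete, here is a small check, assuming scikit-learn is available. It ships a `LeaveOneOut` splitter in `sklearn.model_selection`, so the folds do not have to be built by hand there. The 25-sample array is a placeholder, not real data:

```python
import numpy as np
from sklearn.model_selection import KFold, LeaveOneOut

# 25 hypothetical samples with 2 features each.
X = np.arange(50).reshape(25, 2)

# Collect the (train, test) index lists produced by each splitter.
loo_splits = [(tr.tolist(), te.tolist()) for tr, te in LeaveOneOut().split(X)]
kfold_splits = [(tr.tolist(), te.tolist())
                for tr, te in KFold(n_splits=len(X)).split(X)]

# LOOCV is k-fold with k equal to the number of samples.
same_splits = loo_splits == kfold_splits
# Every sample is held out exactly once.
held_out = sorted(te[0] for _, te in loo_splits)
print(same_splits, len(loo_splits))
```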
Nice answer. OVO is the One-Vs-One approach, so it compares the classes pairwise, one pair at a time. I actually forgot to mention that I also need this approach for a multi-class problem, for instance:
Tumor classes: GBM, BrCa, PAAD, Lung, HBC, CRC
Healthy class: HC
Given these 7 classes, I have to predict which one each sample belongs to. I have 285 samples and 2519 genes.
So in that case, would an approach like the one below do the job?
Training:
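import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

def loocv(train_X, train_y):
    # Convert to arrays so integer-index selection works (e.g. for DataFrames)
    X = np.asarray(train_X)
    y = np.asarray(train_y)
    # Define LOOCV: each split holds out exactly one sample
    loo = LeaveOneOut()
    # Collect the true and predicted label of each held-out sample
    y_true, y_pred = [], []
    for train_index, test_index in loo.split(X):
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_test = y[train_index], y[test_index]
        # Fit a fresh OVO-wrapped linear SVM on all samples but one
        model = SVC(kernel='linear', random_state=0)
        ovo_classifier = OneVsOneClassifier(model)
        ovo_classifier.fit(X_train, y_train)
        # Predict the single held-out sample
        yhat = ovo_classifier.predict(X_test)
        y_true.append(y_test[0])
        y_pred.append(yhat[0])
    # Note: ovo_classifier is the model from the last fold only
    return y_true, y_pred, ovo_classifier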
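The same OVO-inside-LOOCV loop can probably be written more compactly with scikit-learn's `cross_val_predict`, which refits the estimator on each training split and returns one prediction per sample. This is only a sketch under assumptions: the data below is random placeholder data with the seven class labels from above, not the real count matrix, and the sample and feature counts are made up.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

# Placeholder data: 6 samples per class, 30 features (hypothetical sizes).
rng = np.random.default_rng(0)
labels = np.array(["GBM", "BrCa", "PAAD", "Lung", "HBC", "CRC", "HC"])
y = np.repeat(labels, 6)
class_shift = np.repeat(np.arange(len(labels)), 6)[:, None]
X = rng.normal(size=(len(y), 30)) + 3.0 * class_shift

# OVO wrapper around a linear SVM, evaluated with leave-one-out CV.
clf = OneVsOneClassifier(SVC(kernel="linear", random_state=0))
y_pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())

# Each sample was predicted exactly once by a model that never saw it.
acc = accuracy_score(y, y_pred)
cm = confusion_matrix(y, y_pred, labels=labels)
print(acc)
print(cm)
```

So OVO and LOOCV do not conflict: OVO is how the multiclass model is built inside each fold, and LOOCV is how the folds are chosen around it.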
Validation :
Result :