I ran a 5-fold cross-validation to test a binary classifier. Using the results, I constructed a ROC curve for each fold with the ROCR package.
Is there any way to construct a mean ROC curve based on these 5 curves?
In 5-fold cross-validation you train on 4 of the folds and predict on the held-out fold, repeating until every fold has served as the hold-out set once. Each data point therefore ends up with exactly one hold-out prediction. Given that, you could simply concatenate the scores and labels from the 5 hold-out predictions (which together should cover all of your data points) and construct a single ROC curve from them.
Ideally, with enough data and random assignment of samples to folds, there should not be large discrepancies between the ROC curves of the individual folds.
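Here is a minimal sketch of the pooling idea in R with ROCR. The data are simulated, and the names fold_scores and fold_labels are hypothetical stand-ins for your own per-fold hold-out scores and true labels. ROCR's prediction() also accepts lists with one element per fold, and its plot method can then draw an averaged curve through the avg argument, so that is shown as a second option. Pooling assumes the scores from the 5 models are on a comparable scale.

```r
library(ROCR)

set.seed(1)

# Simulated hold-out results (assumption): one vector of labels and one of
# scores per fold. Replace these with your own cross-validation output.
n_folds <- 5
fold_labels <- lapply(seq_len(n_folds), function(i) rep(c(0, 1), each = 20))
fold_scores <- lapply(fold_labels, function(y) rnorm(length(y), mean = y))

# Option 1: concatenate all hold-out predictions and build one ROC curve.
pooled_pred <- prediction(unlist(fold_scores), unlist(fold_labels))
pooled_perf <- performance(pooled_pred, "tpr", "fpr")
pooled_auc  <- performance(pooled_pred, "auc")@y.values[[1]]

# Option 2: keep the folds separate and let ROCR average the curves.
cv_pred <- prediction(fold_scores, fold_labels)
cv_perf <- performance(cv_pred, "tpr", "fpr")

plot(cv_perf, col = "grey", lty = 3)                  # one curve per fold
plot(cv_perf, avg = "vertical", lwd = 2, add = TRUE)  # vertically averaged curve
plot(pooled_perf, col = "red", add = TRUE)            # pooled curve
```

If the grey per-fold curves scatter widely around the averaged one, that is a hint that the folds are too small or not well randomized.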
Yes, I meant to concatenate them into a single vector. Nearly all methods provide a score which is used for ranking. It's odd that it's not provided as output. They may have an internal score that is just not getting returned to the user. You might need to contact the authors or see if you can modify their code to return the score as an output.
This seems off-topic for Biostars. It might be more appropriate to ask on stats.stackexchange.com.