Dear All
I have recently started learning LIBSVM, and I can get it to work with the datasets from the LIBSVM website (I reach around 90% accuracy). However, when I try to apply it to my own unknown dataset for testing, the prediction accuracy is very low.
One thing I am not sure about is the format of the unknown data. In the publicly available LIBSVM data (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/), both the training and test sets mix positive and negative examples and give the class label in the first column (+1 or -1). But for my unknown data, what should I put in the first column? I think I should leave it empty or just put 0; otherwise, if I indicate which examples are positive and which are negative, the prediction no longer makes sense. Please help me, thanks!
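In case it helps to see the file layout concretely, here is a minimal sketch of writing unlabeled samples in the LIBSVM text format (`<label> <index>:<value> ...`, with 1-based feature indices). The helper name `to_libsvm_line`, the placeholder label 0, and the sample values are assumptions made up for illustration; as far as I know the label column has to hold some number, so it cannot simply be left empty.

```python
def to_libsvm_line(features, label=0):
    """Format one sample as a LIBSVM line: '<label> <index>:<value> ...'.

    Indices are 1-based and ascending; zero-valued features are skipped
    because the format is sparse. The default label 0 is only a placeholder
    for data whose true class is unknown.
    """
    parts = [str(label)]
    for index, value in enumerate(features, start=1):
        if value != 0:
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


# Made-up feature vectors standing in for the unknown data.
unknown_samples = [[0.5, 0.0, 1.25], [0.0, 2.0, 0.0]]
lines = [to_libsvm_line(sample) for sample in unknown_samples]
print("\n".join(lines))
```

The printed lines can be saved to a file and passed to the prediction tool; the placeholder label only affects the reported accuracy, not the predicted classes.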
I do not know the answer for sure, but in legacy programs I have worked with, "-1" was used for unclassified data.
Thanks for the reply. I tried with either all +1 or all -1, and it gives different accuracy scores: around 60% with +1 and only 4% with -1. However, when I look at file.predict, which is the output of the prediction, the counts of +1 and -1 predictions are the same in both runs. That's a bit weird.
Note that I used data consisting entirely of true positives as my unknown set, but only about 2/3 of it was predicted as +1.
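A possible explanation, sketched in Python: to my understanding, the accuracy printed by the prediction tool is simply the fraction of predictions that match the first column of the test file, so a dummy label changes the reported score without changing the predictions themselves. The `predicted` list and the `accuracy` helper below are hypothetical stand-ins, not the real contents of file.predict.

```python
from collections import Counter


def accuracy(predicted, labels):
    """Fraction of predictions that match the labels in the first column."""
    matches = sum(p == y for p, y in zip(predicted, labels))
    return matches / len(labels)


# Hypothetical predictions; in practice these would be read from file.predict.
predicted = [1, 1, -1, 1, -1, 1]

# The predictions do not depend on the placeholder labels...
counts = Counter(predicted)

# ...but the reported accuracy does.
acc_all_pos = accuracy(predicted, [1] * len(predicted))
acc_all_neg = accuracy(predicted, [-1] * len(predicted))

print(counts[1], counts[-1])
print(f"{acc_all_pos:.2f} {acc_all_neg:.2f}")
```

With placeholder labels the accuracy number says nothing about the model; counting the +1 and -1 lines in the prediction output is the meaningful check for truly unknown data.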
I'd use StackOverflow for this. Be sure to post the answer you get here!
Thanks for the reply, but the last time I posted on StackOverflow, they only corrected some spelling and grammar, and nobody provided an answer. When I posted the same question here, I got several answers within half an hour. I don't know, for me this forum is more active ;-)