This classifier object uses the LIBSVM software package to implement a support vector machine. A support vector machine (SVM) is a classifier that learns a function f that minimizes the hinge loss of its predictions on the training data, while also applying a penalty for more complex f (the penalty is based on the norm of f in a reproducing kernel Hilbert space). The SVM has a parameter C that controls the trade-off between the empirical loss (i.e., a smaller prediction error on the training set) and the complexity of f. SVMs can use different kernels to create nonlinear decision boundaries.
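The trade-off described above can be written as a single objective. This is the standard soft-margin form (bias term omitted), given as a sketch; the symbols n (number of training points), x_i (training points), and y_i (labels in {-1, +1}) are notation introduced here and do not appear in the original text.

```latex
\min_{f \in \mathcal{H}} \;\; C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i f(x_i)\bigr) \;+\; \frac{1}{2}\,\lVert f \rVert_{\mathcal{H}}^{2}
```

The first term is the hinge loss on the training set and the second is the complexity penalty, so a larger C puts more weight on fitting the training data.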
SVMs are designed to work on binary classification problems, so in order to support multi-class classification, we use two different methods. The first multi-class method is called ‘all-pairs’ and works by training a separate classifier for every pair of labels (i.e., if there are 100 different classes then nchoosek(100, 2) = 4950 different classifiers are trained). Testing in all-pairs involves having all classifiers classify the test point, and the class label is then given to the class that was chosen most often by the binary classifiers (if several classes tie for the most contests won, the class label is chosen randomly among them). The decision values for all-pairs are the number of contests won by each class (for each test point).
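The all-pairs voting rule can be sketched in a few lines of MATLAB. This is a minimal illustration for a single test point, not the toolbox's implementation: the variable names and the contest winners in pair_winners are made up for the example.

```matlab
% Hypothetical example: 4 classes, so nchoosek(4, 2) = 6 pairwise contests.
num_classes = 4;
pairs = nchoosek(1:num_classes, 2);   % each row is one (class_a, class_b) contest

% Assumed winner of each contest for one test point (made-up values,
% one entry per row of pairs; each winner is a member of its pair).
pair_winners = [1; 3; 1; 3; 2; 3];

% Decision values: number of contests won by each class.
decision_values = accumarray(pair_winners, 1, [num_classes 1]);

% Predicted label: class with the most wins, ties broken at random.
best = find(decision_values == max(decision_values));
predicted_label = best(randi(numel(best)));
```

With these made-up winners, class 3 wins three of the six contests and is returned as the predicted label; the random tie-break only matters when two or more classes share the maximum count.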
The second multi-class method is called ‘one-vs-all’ classification and works by training one classifier for each class (thus if there are 100 classes there will be 100 classifiers), using data from one class as the positive labels and data from all the other classes as the negative labels. A test point is then run through these different classifiers, and the class that has the largest SVM prediction value is returned as the predicted label (i.e., SVMs create a function f(x) = y, and the class label is usually given as sign(y); here, however, we compare the actual y values to determine the label). The decision values here are the f(x) = y values returned by the SVM. Our limited tests have found that all-pairs is faster and gives slightly more accurate results, so it is the default (although its decision values might be considered more crude).
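The one-vs-all decision rule can be sketched as follows. This is an illustration under assumptions rather than the toolbox's code: the f(x) values are made up, and the layout (one row per class, one column per test point) is chosen for the example and may differ from the layout the classifier actually returns.

```matlab
% Hypothetical f(x) values from 3 one-vs-all classifiers (rows)
% for 2 test points (columns).
decision_values = [ 0.8  -1.2;
                   -0.3   0.4;
                   -1.1  -0.2];

% The predicted label is the class whose classifier gives the largest f(x),
% rather than the sign of any single classifier's output.
[~, predicted_labels] = max(decision_values, [], 1);
% predicted_labels is [1 2]
```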
Note: To use this classifier, LIBSVM must be downloaded and set up to work with MATLAB. Instructions on how to set up LIBSVM can be found here.
Properties and Methods
cl = libsvm_CL
cl = train(cl, XTr, YTr)
Learns a classification function that is a linear combination of kernel functions evaluated at the training points. The learned function tries to reduce both the empirical error (error on the training set) and the complexity of the learned function.
[predicted_labels decision_values] = test(cl, XTe)
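Predicts the class of each test point (XTe) based on the function learned from the training data. For binary problems this is done by taking the sign of the learned function applied to each test point; for multi-class problems the all-pairs or one-vs-all scheme described above is used.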
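A typical call sequence using the constructor and the two methods above might look like the following sketch. It assumes LIBSVM and this classifier are on the MATLAB path. XTr, YTr, and XTe are placeholders for your own data in whatever layout the classifier expects, and YTe is a hypothetical vector of true numeric test labels used only to score the predictions.

```matlab
% XTr, YTr, XTe (and the hypothetical YTe) must already exist in the workspace.
cl = libsvm_CL;                                       % classifier with default settings
cl = train(cl, XTr, YTr);                             % learn from the training set
[predicted_labels, decision_values] = test(cl, XTe);  % classify the test points

% Fraction of test points classified correctly (assumes numeric labels).
accuracy = mean(predicted_labels(:) == YTe(:));
```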
The following properties can be set to change this classifier’s behavior (here we are assuming that svm = libsvm_CL):
svm.C: scalar (default svm.C = 1)
This is the inverse of the regularization constant (1/regularization-constant) that determines the trade-off between the fit to the training data and the amount of regularization/simplicity of the learned function. A larger value of C means more emphasis on a better fit to the training data.
svm.kernel: ‘linear’, ‘polynomial’/‘poly’, or ‘gaussian’/‘rbf’ (default ‘linear’).
The type of kernel used, which controls the family of functions that the classifier is built from.
If svm.kernel = 'polynomial', then the following options must be set:
- svm.poly_degree: scalar (no default value)
The degree of the polynomial.
- svm.poly_offset: scalar (default svm.poly_offset = 0)
Allows one to include a constant in the polynomial kernel.
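To make the roles of the degree and the offset concrete, here is a small sketch of a polynomial kernel. LIBSVM defines its polynomial kernel as (gamma * u' * v + coef0) ^ degree. The sketch assumes that poly_degree and poly_offset play the roles of degree and coef0, and it fixes gamma at 1 purely for illustration; how the toolbox sets gamma is not stated here.

```matlab
% Polynomial kernel with gamma fixed at 1 (an assumption for this example).
% poly_kernel is a hypothetical helper, not a toolbox function.
poly_kernel = @(u, v, degree, offset) (u' * v + offset) .^ degree;

u = [1; 2];
v = [3; 1];

k_no_offset   = poly_kernel(u, v, 2, 0);   % (1*3 + 2*1)^2 = 25
k_with_offset = poly_kernel(u, v, 2, 1);   % (5 + 1)^2     = 36
```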
If svm.kernel = 'gaussian', then the following options must be set:
- svm.gaussian_gamma: scalar (no default value)
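This controls the fall-off of the radial basis function kernel. In LIBSVM's parameterization the kernel is exp(-gamma * |u - v|^2), so, assuming the value is passed to LIBSVM unchanged, larger values mean a faster fall-off.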
svm.multiclass_classificaion_scheme (default value is ‘all_pairs’)
This field can be set to ‘all_pairs’ or ‘one_vs_all’ and determines if a one-vs-all or an all-pairs multi-class classification scheme is used, as described above (for binary problems both all-pairs and one-vs-all will return the same predicted labels, although the decision values will differ).
svm.additional_libsvm_options (default svm.additional_libsvm_options = '')
This allows one to have additional control of the classifier’s behavior by using a string that is in LIBSVM format. For more details see: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
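Putting the properties together, a configuration might look like the sketch below. The numeric values are arbitrary illustrations, not recommendations. The option '-e' is LIBSVM's tolerance of the termination criterion; exactly how the toolbox combines this string with its own options is an assumption here.

```matlab
svm = libsvm_CL;

svm.C = 10;                     % more emphasis on fitting the training data
svm.kernel = 'gaussian';        % requires gaussian_gamma to be set
svm.gaussian_gamma = 0.01;      % arbitrary illustrative value

% Property name copied verbatim from the description above.
svm.multiclass_classificaion_scheme = 'one_vs_all';

% Extra options in LIBSVM's command-line format (here: stopping tolerance).
svm.additional_libsvm_options = '-e 0.01';
```

The configured object is then passed to train and test in the same way as a default one.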