Support Vector Machines (SVM) are supervised learning models with associated learning algorithms that analyze data for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.
The training points that lie on the margin, and thus determine its position, are called "support vectors".
A hyperplane is a subspace of one dimension less than its ambient space. If a space is 3-dimensional, then its hyperplanes are the 2-dimensional planes, while if the space is 2-dimensional, its hyperplanes are the 1-dimensional lines.
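As a quick illustrative check (the values of w and b here are arbitrary choices), in 2-D a hyperplane wT x + b = 0 is a line, and the sign of wT x + b tells us which side of that line a point falls on:

```python
import numpy as np

# In 2-D, the hyperplane w.x + b = 0 is a 1-D line.
# Illustrative choice: w = (1, -1), b = 0 is the line y = x.
w = np.array([1.0, -1.0])
b = 0.0

on_line = np.array([2.0, 2.0])  # satisfies w.x + b = 0, so it lies on the line
above = np.array([1.0, 3.0])    # w.x + b < 0: one side of the line
below = np.array([3.0, 1.0])    # w.x + b > 0: the other side

for p in (on_line, above, below):
    print(p, w @ p + b)
```

An SVM uses exactly this sign of wT x + b to assign a new point to one class or the other.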
“The goal of support vector machines (SVM) is to find an optimal hyperplane that separates the data into classes.”
In SVM, we use training data to build a model that best predicts the labels of unseen test data. There are two approaches in SVM for fitting a model to training data – Hard Margin SVM and Soft Margin SVM.
In the case of a hard margin classifier, we find "w" and "b" such that
ø(w) = 2/||w|| is maximized, subject to yi(wT xi + b) >= 1 for every training pair (xi, yi).
Maximizing 2/||w|| (the margin width) is equivalent to minimizing (1/2)wTw, which is the form the soft margin objective builds on.
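This can be sketched with scikit-learn's `SVC` on a linearly separable toy dataset (the data points are illustrative choices). `SVC` has no exact hard-margin mode, but a very large C approximates one, after which every training point satisfies yi(wT xi + b) >= 1:

```python
import numpy as np
from sklearn.svm import SVC

# Linearly separable toy data (illustrative points).
X = np.array([[1, 1], [2, 2], [2, 0],
              [-1, -1], [-2, -2], [-2, 0]], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1])

# A very large C approximates a hard-margin classifier.
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Every training point satisfies y_i (w.x_i + b) >= 1 (up to tolerance).
margins = y * (X @ w + b)
print("min functional margin:", margins.min())

# The geometric margin width is 2 / ||w||.
print("margin width:", 2 / np.linalg.norm(w))
```

The minimum of `margins` being (approximately) 1 confirms the hard-margin constraints hold for this separable data.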
Soft margin, unlike hard margin, doesn't require all the training data to be classified correctly. As a result, a soft margin classifier may misclassify some of the training data. However, on average it has comparatively higher prediction accuracy than a hard margin classifier on test data.
The concept of a "Slack Variable" εi is introduced to allow misclassification, where εi measures how far the point falls on the wrong side of its margin boundary.
In the case of a soft margin classifier, we find "w" and "b" such that
ø(w) = (1/2)wTw + C∑εi is minimized, subject to yi(wT xi + b) >= 1 − εi and εi >= 0 for every training pair (xi, yi). The slack εi is positive only for points that violate the margin.
Parameter "C" can be viewed as a way to control over-fitting.
For a given point:
If εi = 0, the point is classified correctly and lies on or outside the margin.
If 0 < εi ≤ 1, the point is classified correctly but lies between the hyperplane and the margin, on the correct side of the hyperplane. This point exhibits a margin violation.
If εi > 1, the point is misclassified: it lies on the wrong side of the hyperplane, beyond the margin.
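The slack values can be recovered from a fitted soft margin model as εi = max(0, 1 − yi(wT xi + b)). A minimal sketch using scikit-learn's `SVC` (the data points and choice of C = 1 are illustrative):

```python
import numpy as np
from sklearn.svm import SVC

# Toy data; the point (0.5, 0.5) sits close to the other class so that
# some slack is likely non-zero (illustrative placement).
X = np.array([[2, 2], [3, 1], [1, 3], [0.5, 0.5],
              [-2, -2], [-3, -1], [-1, -3]], dtype=float)
y = np.array([1, 1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Slack for each point: eps_i = max(0, 1 - y_i (w.x_i + b))
eps = np.maximum(0.0, 1.0 - y * (X @ w + b))
for e in eps:
    if e == 0:
        status = "on or outside the margin (correct)"
    elif e <= 1:
        status = "inside the margin (correct, margin violation)"
    else:
        status = "misclassified"
    print(round(e, 3), status)
```

The three branches of the `if` mirror the three εi cases listed above.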
C is a regularization parameter that controls the margin as follows:
A small value of C implies that the model is more tolerant of margin violations and hence has a larger margin.
A large value of C makes the constraints hard to ignore, and hence the model has a smaller margin.
As C approaches infinity, all the constraints are enforced, and the SVM model becomes a hard-margin classifier.
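The effect of C on the margin can be seen by fitting the same data with a small and a large C and comparing the margin width 2/||w|| (the synthetic two-blob data below is an illustrative choice):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two slightly overlapping Gaussian blobs (illustrative synthetic data).
X = np.vstack([rng.normal(2.0, 1.0, size=(50, 2)),
               rng.normal(-2.0, 1.0, size=(50, 2))])
y = np.array([1] * 50 + [-1] * 50)

widths = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    widths[C] = 2 / np.linalg.norm(clf.coef_[0])
    print(f"C={C}: margin width = {widths[C]:.3f}")

# A small C yields a wider (more tolerant) margin;
# a large C yields a narrower margin, approaching hard margin behavior.
```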