One Class SVM
An algorithm for novelty detection and empirical distribution support estimation
Assume the points are separable from the origin; what, then, is the maximum margin? For a hyperplane $\langle\mathbf{w},\mathbf{x}\rangle=\rho$ with $\rho>0$ that keeps all points on its far side, the distance from the origin is $\rho/\lVert\mathbf{w}\rVert$, and this is the margin to maximize.
The problem can be formulated as

$$\min_{\mathbf{w},\,\rho}\ \frac{1}{2}\lVert\mathbf{w}\rVert^2-\rho\quad\text{subject to}\quad\langle\mathbf{w},\mathbf{x}_i\rangle\ge\rho$$
for all data points $\mathbf{x}_i$. The problem is equivalent to the binary case where $\{\mathbf{x}_i\}$ and $\{-\mathbf{x}_i\}$ are the two classes.
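As a sanity check, this primal is a small quadratic program that can be handed to a generic solver. A minimal sketch, assuming `cvxpy` and a toy 2-D dataset in the positive quadrant (both are illustrative choices, not part of the derivation):

```python
import cvxpy as cp
import numpy as np

# Toy data in the positive quadrant, hence separable from the origin.
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 3.0], [2.5, 2.0]])
n, d = X.shape

w = cp.Variable(d)
rho = cp.Variable()

# min (1/2)||w||^2 - rho  subject to  <w, x_i> >= rho for all i
objective = cp.Minimize(0.5 * cp.sum_squares(w) - rho)
cp.Problem(objective, [X @ w >= rho]).solve()

# Distance from the origin to the hyperplane <w, x> = rho.
margin = rho.value / np.linalg.norm(w.value)
print(w.value, rho.value, margin)
```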
The Lagrangian is

$$L(\mathbf{w},\rho,\boldsymbol{\alpha})=\frac{1}{2}\lVert\mathbf{w}\rVert^2-\rho-\sum_i\alpha_i\bigl(\langle\mathbf{w},\mathbf{x}_i\rangle-\rho\bigr)$$
with KKT conditions $\alpha_i\bigl(\langle\mathbf{w},\mathbf{x}_i\rangle-\rho\bigr)=0$, where $\alpha_i\ge0$.
Take gradients with respect to $\mathbf{w}$ and $\rho$ and set them to zero:

$$\nabla_{\mathbf{w}}L=\mathbf{w}-\sum_i\alpha_i\mathbf{x}_i=0,\qquad\frac{\partial L}{\partial\rho}=\sum_i\alpha_i-1=0.$$

In other words,

$$\mathbf{w}=\sum_i\alpha_i\mathbf{x}_i$$

and

$$\sum_i\alpha_i=1.$$
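Substituting these two conditions back into the Lagrangian eliminates both $\mathbf{w}$ and $\rho$:

$$L=\frac{1}{2}\Bigl\lVert\sum_i\alpha_i\mathbf{x}_i\Bigr\rVert^2-\rho-\sum_i\alpha_i\Bigl(\Bigl\langle\sum_j\alpha_j\mathbf{x}_j,\mathbf{x}_i\Bigr\rangle-\rho\Bigr)=-\frac{1}{2}\sum_{i,j}\alpha_i\alpha_j\langle\mathbf{x}_i,\mathbf{x}_j\rangle,$$

so only the quadratic form in $\boldsymbol{\alpha}$ survives; maximizing it over the feasible $\boldsymbol{\alpha}$ gives the dual below.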
Thus far the data was assumed separable; the soft margin loss function relaxes this by charging for violations:

$$\min_{\mathbf{w},\rho}\ \frac{1}{2}\lVert\mathbf{w}\rVert^2-\rho+\frac{1}{\nu n}\sum_{i=1}^{n}\max\bigl(0,\ \rho-\langle\mathbf{w},\Phi(\mathbf{x}_i)\rangle\bigr),$$

where $n$ is the number of training points, $\nu\in(0,1]$ trades margin against violations, and $\Phi$ maps the data into a feature space with kernel $k(\mathbf{x},\mathbf{y})=\langle\Phi(\mathbf{x}),\Phi(\mathbf{y})\rangle$. The slack-variable form of this objective is derived at the end of this section.
Thus, the dual problem is

$$\max_{\boldsymbol{\alpha}}\ -\frac{1}{2}\sum_{i,j}\alpha_i\alpha_j k(\mathbf{x}_i,\mathbf{x}_j)\quad\text{subject to}\quad 0\le\alpha_i\le\frac{1}{\nu n},\quad\sum_i\alpha_i=1,$$

where the box constraint on $\alpha_i$ comes from the slack-variable derivation below. Hence we have the quadratic program

$$\min_{\boldsymbol{\alpha}}\ \frac{1}{2}\boldsymbol{\alpha}^{\top}K\boldsymbol{\alpha}\quad\text{subject to}\quad\mathbf{0}\le\boldsymbol{\alpha}\le\frac{1}{\nu n}\mathbf{1},\quad\mathbf{1}^{\top}\boldsymbol{\alpha}=1,$$

where $K_{ij}=k(\mathbf{x}_i,\mathbf{x}_j)$ is the Gram matrix.
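A minimal numerical sketch of this quadratic program, assuming an RBF kernel and SciPy's SLSQP solver (a dedicated QP solver would be the more usual choice; the data and parameter values are illustrative):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))                      # training points
n, nu, gamma = len(X), 0.1, 0.5

K = np.exp(-gamma * cdist(X, X, "sqeuclidean"))   # Gram matrix K_ij = k(x_i, x_j)
ub = 1.0 / (nu * n)                               # box-constraint upper bound

res = minimize(
    lambda a: 0.5 * a @ K @ a,                    # dual objective (minimized)
    x0=np.full(n, 1.0 / n),                       # feasible start: sum alpha = 1
    jac=lambda a: K @ a,
    bounds=[(0.0, ub)] * n,
    constraints={"type": "eq", "fun": lambda a: a.sum() - 1.0},
    method="SLSQP",
)
alpha = res.x
```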
Then, $\mathbf{w}=\sum_i\alpha_i\Phi(\mathbf{x}_i)$. The dual problem Lagrangian yields the complementary slackness conditions, under which any point with $0<\alpha_i<\frac{1}{\nu n}$ lies exactly on the boundary, $\langle\mathbf{w},\Phi(\mathbf{x}_i)\rangle=\rho$. Without loss of generality, we pick such a support vector $\mathbf{x}_i$.
The $\rho$ can be solved by plugging in the support vectors:

$$\rho=\langle\mathbf{w},\Phi(\mathbf{x}_i)\rangle=\sum_j\alpha_j k(\mathbf{x}_j,\mathbf{x}_i).$$
The decision function is given by

$$f(\mathbf{x})=\operatorname{sgn}\Bigl(\sum_j\alpha_j k(\mathbf{x}_j,\mathbf{x})-\rho\Bigr),$$

and $f(\mathbf{x}_i)\ge0$ for all data points except the outliers.
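Continuing the sketch above, $\rho$ can be recovered by averaging over the margin support vectors ($0<\alpha_i<\frac{1}{\nu n}$), after which the decision function needs only kernel evaluations against the training points:

```python
tol = 1e-6
on_margin = (alpha > tol) & (alpha < ub - tol)    # margin support vectors
rho = np.mean(K[on_margin] @ alpha)               # rho = sum_j alpha_j k(x_j, x_i)

def decision(X_new):
    # f(x) = sgn( sum_j alpha_j k(x_j, x) - rho )
    K_new = np.exp(-gamma * cdist(X_new, X, "sqeuclidean"))
    return np.sign(K_new @ alpha - rho)

print(decision(X))                                # mostly +1; outliers get -1
```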
Where does the box constraint come from? We allow violations of the classification criterion $\langle\mathbf{w},\Phi(\mathbf{x}_i)\rangle\ge\rho$ and penalize the misclassifications using the hinge loss $\max\bigl(0,\ \rho-\langle\mathbf{w},\Phi(\mathbf{x}_i)\rangle\bigr)$, which is exactly the soft margin objective stated earlier.
Introduce a slack variable $\xi_i$ as the discrepancy for each violation. With the help of slack variables, we have

$$\min_{\mathbf{w},\boldsymbol{\xi},\rho}\ \frac{1}{2}\lVert\mathbf{w}\rVert^2+\frac{1}{\nu n}\sum_i\xi_i-\rho\quad\text{subject to}\quad\langle\mathbf{w},\Phi(\mathbf{x}_i)\rangle\ge\rho-\xi_i,\quad\xi_i\ge0,$$

where $\alpha_i\ge0$ and $\beta_i\ge0$ are the Lagrange multipliers of the two constraints in the Lagrangian

$$L=\frac{1}{2}\lVert\mathbf{w}\rVert^2+\frac{1}{\nu n}\sum_i\xi_i-\rho-\sum_i\alpha_i\bigl(\langle\mathbf{w},\Phi(\mathbf{x}_i)\rangle-\rho+\xi_i\bigr)-\sum_i\beta_i\xi_i.$$

The gradients are

$$\nabla_{\mathbf{w}}L=\mathbf{w}-\sum_i\alpha_i\Phi(\mathbf{x}_i)=0,\qquad\frac{\partial L}{\partial\xi_i}=\frac{1}{\nu n}-\alpha_i-\beta_i=0,\qquad\frac{\partial L}{\partial\rho}=\sum_i\alpha_i-1=0,$$

so $0\le\alpha_i=\frac{1}{\nu n}-\beta_i\le\frac{1}{\nu n}$,
where $\nu\in(0,1]$ in the original paper (Schölkopf et al.), and thus $\nu n$ is the upper bound on the number of outliers: every outlier has $\xi_i>0$, hence $\beta_i=0$ and $\alpha_i=\frac{1}{\nu n}$, and since $\sum_i\alpha_i=1$, at most $\nu n$ points can be outliers. This is exactly the box constraint used in the dual problem above.
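In practice this whole pipeline is available as `sklearn.svm.OneClassSVM`, whose `nu` parameter is the $\nu$ above; a quick check that the fraction of predicted outliers stays below $\nu$ (data and parameter values are illustrative):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))

clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X)
outlier_fraction = np.mean(clf.predict(X) == -1)   # predict: +1 inlier, -1 outlier
print(outlier_fraction)                            # at most about nu = 0.1
```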