According to the scikit-learn SGDClassifier documentation, the modified Huber loss can be used to offer higher tolerance to outliers.
Looking at a plot of the loss functions, though, doesn't modified Huber seem less tolerant? It appears to assign a higher cost to observations with f(x) < 0, i.e. to observations that lie on the wrong side of the margin. Is this reading correct?

The problem here is that the scikit-learn docs don't specify which baseline loss function modified Huber's outlier tolerance should be compared against.
Modified Huber loss stems from Huber loss, which is used for regression problems. Looking at this plot, we see that Huber loss has a higher tolerance to outliers than squared loss, because it grows only linearly (rather than quadratically) for large residuals. Modified Huber carries the same idea over to classification: it is quadratic near the margin but switches to a linear penalty once an observation is badly misclassified (y f(x) < -1). So, as you've noted, most other classification losses (hinge, log loss) are even more tolerant to outliers; the exception is squared hinge loss, which keeps growing quadratically and is therefore less tolerant. Squared-type losses are the baseline the docs have in mind.
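To make the comparison concrete, here is a minimal NumPy sketch (my own illustration, not scikit-learn's internals, though the modified Huber formula below matches the definition SGDClassifier uses: quadratically smoothed hinge for y f(x) >= -1, linear with slope 4 beyond it):

```python
import numpy as np

def modified_huber(z):
    # z = y * f(x). Quadratic for z >= -1, linear (-4z) for z < -1.
    return np.where(z >= -1, np.maximum(0.0, 1.0 - z) ** 2, -4.0 * z)

def squared_hinge(z):
    # Quadratic everywhere left of the margin.
    return np.maximum(0.0, 1.0 - z) ** 2

def hinge(z):
    # Linear with slope 1 left of the margin.
    return np.maximum(0.0, 1.0 - z)

# Penalties at the margin, on the decision boundary, and for a gross outlier:
z = np.array([1.0, 0.0, -1.0, -10.0])
print(modified_huber(z))  # [  0.   1.   4.  40.] -- linear growth past z = -1
print(squared_hinge(z))   # [  0.   1.   4. 121.] -- quadratic growth throughout
print(hinge(z))           # [  0.   1.   2.  11.] -- gentler still
```

For the badly misclassified point (z = -10), modified Huber charges 40 while squared hinge charges 121, which is the "higher tolerance to outliers" the docs refer to; plain hinge, at 11, is more tolerant than both.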