Explain support vector machines. What is the margin, what do kernels do, and when would you still use an SVM?
tldr
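SVMs learn maximum-margin classifiers: the decision boundary is placed to maximize the margin, the distance from the boundary to the nearest training points (the support vectors). Soft-margin SVMs trade margin width against margin violations through the penalty parameter C. Kernels allow nonlinear boundaries through implicit feature maps, but kernel SVMs can be expensive at scale, and their raw outputs are scores that need calibration before they can be read as probabilities.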
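The pieces named above can be made concrete with a small sketch. It is illustrative only: the function names (`rbf_kernel`, `soft_margin_objective`, `train_linear_svm`) and the toy data are hypothetical, and the trainer uses plain full-batch subgradient descent on the linear soft-margin objective, which is a stand-in for the quadratic-programming style solvers that SVM libraries typically use. The kernel function is shown separately and is not used by the linear trainer.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def decision_value(w, b, x):
    """Signed score w . x + b; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def soft_margin_objective(w, b, X, y, C=1.0):
    """0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * (w . x_i + b))."""
    reg = 0.5 * sum(wi * wi for wi in w)
    hinge = sum(
        max(0.0, 1.0 - yi * decision_value(w, b, xi)) for xi, yi in zip(X, y)
    )
    return reg + C * hinge


def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Full-batch subgradient descent on the soft-margin objective.

    Assumption: labels are +1 / -1. This is a teaching sketch, not a
    production solver.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5 * ||w||^2 term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Points with functional margin below 1 contribute hinge loss.
            if yi * decision_value(w, b, xi) < 1.0:
                for j, xij in enumerate(xi):
                    grad_w[j] -= C * yi * xij
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical toy data: two linearly separable clusters.
X = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0),
     (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0)]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y, C=1.0)
predictions = [1 if decision_value(w, b, xi) >= 0 else -1 for xi in X]
print(predictions)
```

Raising C makes each margin violation cost more, so the fit follows the training points more closely. Raising gamma makes the RBF kernel decay faster with distance, so each training point influences a smaller neighborhood.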
follow-up
- How do C and RBF gamma affect overfitting?
- Why might logistic regression be preferable if you need probabilities?
- Why do kernel methods scale poorly with large datasets?