Summary
A decision boundary divides the feature space into regions labeled by class. We focus on linear decision boundaries.
Discriminant analysis models a discriminant function $\delta_k(x)$ for each class, then classifies to $\hat{G}(x) = \arg\max_k \hat{\delta}_k(x)$.
- The decision boundary between two classes is the set where their discriminant functions are equal, e.g. $\delta_1(x) = \delta_2(x)$.
- Example: fit $K$ linear regression models, one per class indicator response; the resulting decision boundaries are hyperplanes (a sketch follows this list).
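A minimal sketch of the indicator-regression example above, assuming toy 2-D data with made-up class means and seed:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N points in R^2, K = 3 classes (hypothetical means).
N, K = 150, 3
X = np.vstack([rng.normal(loc=m, scale=0.7, size=(N // K, 2))
               for m in ([0, 0], [3, 0], [0, 3])])
g = np.repeat(np.arange(K), N // K)

# Indicator response matrix Y: Y[i, k] = 1 iff observation i is in class k.
Y = np.eye(K)[g]

# Fit K linear regressions at once by least squares on [1, X].
Xa = np.hstack([np.ones((N, 1)), X])        # prepend intercept column
B, *_ = np.linalg.lstsq(Xa, Y, rcond=None)  # B holds all K coefficient vectors

# Classify to the largest fitted value: G_hat(x) = argmax_k delta_k(x).
g_hat = (Xa @ B).argmax(axis=1)
print("training accuracy:", (g_hat == g).mean())
```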
Modeling the posterior $P(G \mid X = x)$ is also a form of discriminant analysis: the posteriors can serve as the discriminant functions.
If some monotone transformation of the discriminant function or posterior is linear in $x$, then the decision boundary is linear.
- Binary example: if the posterior is $\mathrm{sigmoid}(\beta^T x)$, its logit (log-odds) transformation is $\beta^T x$, which is linear, so the decision boundary is linear: $\{x : \beta^T x = 0\}$, where the posterior equals $1/2$.
- LDA and logistic regression both yield linear log-odds (logits); the difference is how the linear function is fit to the training data (see the sketch after this list).
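A sketch of that contrast on simulated two-class Gaussian data (the scikit-learn calls and the data are illustrative assumptions, not from the notes): both fits are linear in $x$, but LDA estimates the line via Gaussian maximum likelihood while logistic regression maximizes the conditional likelihood, so the coefficients differ.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Two Gaussian classes sharing a covariance (the LDA model assumption).
X = np.vstack([rng.normal([0, 0], 1.0, size=(200, 2)),
               rng.normal([2, 2], 1.0, size=(200, 2))])
y = np.repeat([0, 1], 200)

lda = LinearDiscriminantAnalysis().fit(X, y)
lr = LogisticRegression().fit(X, y)

# Both produce log-odds of the form beta_0 + beta^T x; only the fitting differs.
print("LDA     :", lda.intercept_, lda.coef_)
print("logistic:", lr.intercept_, lr.coef_)
```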
An alternative to discriminant analysis is to model a linear boundary explicitly; in the two-class case this is a separating hyperplane. Two such methods (a perceptron sketch follows the list):
- Perceptron
- Optimally separating hyperplane
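A minimal perceptron sketch (the data and epoch cap are made-up illustration; the update $\beta \leftarrow \beta + y_i x_i$ on misclassified points converges when the classes are separable):

```python
import numpy as np

def perceptron(X, y, n_epochs=100):
    """Rosenblatt perceptron for labels y in {-1, +1}."""
    Xa = np.hstack([np.ones((len(X), 1)), X])  # absorb intercept into beta
    beta = np.zeros(Xa.shape[1])
    for _ in range(n_epochs):
        updated = False
        for xi, yi in zip(Xa, y):
            if yi * (xi @ beta) <= 0:          # misclassified (or on boundary)
                beta += yi * xi
                updated = True
        if not updated:                        # a full pass with no mistakes
            return beta
    return beta                                # may not separate if cap is hit

# Separable toy data (hypothetical).
rng = np.random.default_rng(2)
X = np.vstack([rng.normal([0, 0], 0.5, size=(50, 2)),
               rng.normal([3, 3], 0.5, size=(50, 2))])
y = np.repeat([-1, 1], 50)
print("separating hyperplane: beta =", perceptron(X, y))
```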
Generalizations: with a quadratic basis expansion (augmenting the features with their squares and cross-products), a linear decision boundary in the augmented space maps back to a quadratic decision boundary in the original space.
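A sketch of the quadratic-expansion idea, assuming a made-up two-class problem whose true boundary is the unit circle: indicator regression that is linear in the augmented features yields a boundary quadratic in the original ones.

```python
import numpy as np

def quad_expand(X):
    """Map (x1, x2) to (x1, x2, x1^2, x1*x2, x2^2)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1**2, x1 * x2, x2**2])

# Classes separated by a circle: not linearly separable in R^2.
rng = np.random.default_rng(3)
X = rng.uniform(-2, 2, size=(400, 2))
y = (X[:, 0]**2 + X[:, 1]**2 < 1).astype(float)

# Least squares on the 0/1 indicator in the augmented space [1, h(x)].
H = np.hstack([np.ones((len(X), 1)), quad_expand(X)])
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

# Thresholding the fitted value at 1/2 gives a quadratic boundary in x.
y_hat = (H @ beta > 0.5).astype(float)
print("training accuracy:", (y_hat == y).mean())
```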