Multiclass classification
Multiclass classification is used to predict which of several possible classes an observation belongs to. As a supervised machine learning technique, it follows the same iterative train, validate, and evaluate process as regression and binary classification, in which a subset of the training data is held back to validate the trained model.
Example – multiclass classification
Multiclass classification algorithms are used to calculate probability values for multiple class labels, enabling a model to predict the most probable class for a given observation.
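As a minimal sketch of that last step, the snippet below picks the most probable class from a set of class probabilities. The probability values and the function name `predict_class` are made up for illustration; a trained model would supply the probabilities.

```python
# Hypothetical probabilities for one observation, indexed by class label (0, 1, 2).
# These numbers are illustrative only; a trained model would calculate them.
class_probabilities = [0.2, 0.3, 0.5]


def predict_class(probabilities):
    """Return the class label (list index) with the highest probability."""
    return max(range(len(probabilities)), key=lambda label: probabilities[label])


predicted = predict_class(class_probabilities)
print(predicted)  # 2
```

The predicted class is simply the label whose probability is largest; the other probabilities indicate how confident the model is in the alternatives.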
Let’s explore an example in which we have some observations of penguins, with the flipper length (x) of each penguin recorded. For each observation, the data also includes the penguin species (y), which is encoded as follows:
- 0: Adelie
- 1: Gentoo
- 2: Chinstrap
Note
As with previous examples in this module, a real scenario would include multiple feature (x) values. We’ll use a single feature to keep things simple.
| Flipper length (x) | Species (y) |
| --- | --- |
| 167 | 0 |
| 172 | 0 |
| 225 | 2 |
| 197 | 1 |
| 189 | 1 |
| 232 | 2 |
| 158 | 0 |
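To make the data concrete, here is one way the table above could be represented in code, with some rows held back for validation. This is a sketch under stated assumptions: the helper `split_rows`, the 30% validation fraction, and the random seed are illustrative choices, not values from the module, and seven rows are far too few for a real model.

```python
import random

# Species encoding from the list above.
SPECIES = {0: "Adelie", 1: "Gentoo", 2: "Chinstrap"}

# Observations from the table: flipper length (x) and encoded species (y).
flipper_length = [167, 172, 225, 197, 189, 232, 158]
species = [0, 0, 2, 1, 1, 2, 0]


def split_rows(x, y, validation_fraction=0.3, seed=0):
    """Shuffle (x, y) pairs and hold back a fraction of them for validation.

    The fraction and seed are assumptions made for this illustration.
    """
    rows = list(zip(x, y))
    random.Random(seed).shuffle(rows)
    n_validation = max(1, round(len(rows) * validation_fraction))
    return rows[n_validation:], rows[:n_validation]


train_rows, validation_rows = split_rows(flipper_length, species)

for length, label in train_rows:
    print(f"train: flipper length {length} -> {SPECIES[label]}")
for length, label in validation_rows:
    print(f"validation: flipper length {length} -> {SPECIES[label]}")
```

The model would be fitted using only `train_rows`; the rows in `validation_rows` are then used to check how well its predictions match the known species.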
Training a multiclass classification model
To train a multiclass classification model, we need to use an algorithm to fit the training data to a function that calculates a probability value for each possible class. There are two kinds of algorithm you can use to do this:
- One-vs-Rest (OvR) algorithms
- Multinomial algorithms
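The difference between the two kinds of algorithm listed above can be sketched by looking at how each turns per-class scores into probabilities. This is an illustrative simplification, not a training procedure: the `scores` values are invented rather than learned from the penguin data, and the function names are hypothetical. It assumes a One-vs-Rest approach that applies a sigmoid function independently for each class, and a multinomial approach that applies a softmax function across all classes at once.

```python
import math


def sigmoid(z):
    """Map a score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def one_vs_rest_probabilities(scores):
    """One-vs-Rest: each class has its own binary 'this class vs. the rest' function.

    The probabilities are calculated independently, so they need not sum to 1.
    """
    return [sigmoid(z) for z in scores]


def multinomial_probabilities(scores):
    """Multinomial: a single function yields one distribution over all classes.

    Softmax is used here; the outputs always sum to 1.
    """
    largest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(z - largest) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


def most_probable(probabilities):
    """Return the class label with the highest probability."""
    return max(range(len(probabilities)), key=lambda label: probabilities[label])


# Hypothetical scores for classes 0, 1, and 2 for a single observation.
scores = [-1.0, 0.5, 2.0]

ovr = one_vs_rest_probabilities(scores)
multinomial = multinomial_probabilities(scores)

print("One-vs-Rest:", ovr, "-> class", most_probable(ovr))
print("Multinomial:", multinomial, "-> class", most_probable(multinomial))
```

In both cases the predicted class is the one with the highest probability; the approaches differ in whether the per-class probabilities are calculated separately or as a single distribution.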