**Table of Contents**

- What is Multiclass Classification
- Code Example
- Video Tutorial

**1 What is Multiclass Classification?**

Multi-class classification is a classification task with more than two classes, where each example belongs to exactly one of the possible classes.

- Multi-class classification is probably the most common machine learning and deep learning classification task.
- Almost every neural network can be made into a classifier by simply applying a softmax function to the last layer.

**Examples**:

- Predicting the animal class from an animal image is an example of multi-class classification, where each animal can belong to only one category.
- Predicting digits from handwritten digit images is another example of multi-class classification.

**2 Code Example:**

- Similar to other models, we start with input nodes; these are the features that represent the examples.
- We want our neural network to learn from this input data, which flows through multiple hidden layers. In multi-class classification, the neural network has the same number of output nodes as there are classes; each output node belongs to one class and outputs a result for that class. The results from the last layer are passed through a softmax layer.
- The softmax function creates a probability distribution over the n classes and produces an output vector of length n. Each element of the vector is the probability that the input belongs to the corresponding class; the most likely class is chosen by selecting the index of the vector with the highest probability.
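As a minimal illustration (plain NumPy, not part of the tutorial's Keras code), the softmax step can be sketched as:

```python
import numpy as np

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

scores = np.array([2.0, 1.0, 0.1])       # raw outputs from the last layer
probs = softmax(scores)                  # probability distribution over 3 classes
predicted_class = int(np.argmax(probs))  # index with the highest probability
```

The probabilities sum to 1, and the predicted class is simply the arg-max of the output vector.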
- Here we are going to use the Keras built-in MNIST dataset, one of the most commonly used datasets for image classification.

- MNIST contains 60,000 training images and 10,000 test images; our main focus will be predicting digits from the test images.
- First, we import the Sequential model API from Keras. We use Dense and Dropout layers, so we have to import them from Keras as well.
- Then we import to_categorical from keras.utils, which will help us convert a class vector into a binary (one-hot) matrix.
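With the modern `tensorflow.keras` layout (an assumption; older standalone Keras imports from `keras.models`, `keras.layers`, and `keras.utils` directly), the imports described above are:

```python
# Sequential model API, the Dense and Dropout layers,
# and to_categorical for one-hot encoding the labels
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.utils import to_categorical
```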

Now we load the MNIST dataset from Keras; load_data() loads the data and splits it into training and test sets.
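A sketch of the loading step, assuming the `tensorflow.keras` dataset module (the call downloads MNIST on first use):

```python
from tensorflow.keras.datasets import mnist

# load_data() returns the data already split into training and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

print(x_train.shape)  # (60000, 28, 28)
print(x_test.shape)   # (10000, 28, 28)
```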

x_train and x_test contain grayscale pixel values, while y_train and y_test contain labels from 0 to 9, representing which digit each image actually is. To visualize any training example, we can use the matplotlib library and display the associated y value.

- For example, the training image at position 298 has the associated y value 3.
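Displaying that example with matplotlib can be sketched as below; the snippet reloads the data so it is self-contained, and index 298 is the one used in the walkthrough:

```python
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist

(x_train, y_train), _ = mnist.load_data()

plt.imshow(x_train[298], cmap="gray")  # 28x28 grayscale image
plt.title(f"y = {y_train[298]}")       # the associated label
plt.show()
```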
- Then we reshape the data and also normalize it. Our images are 28 by 28, and the reshape function converts each one into a flat vector of 784 values.
- Next, we convert the data type to float32, then normalize the RGB data by dividing it by the maximum RGB value, which is 255.
- Finally, we convert the y vectors into a binary matrix using to_categorical.
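The three preprocessing steps can be sketched as follows; the random arrays here are stand-ins with MNIST's shapes so the snippet is self-contained (in the tutorial they come from `mnist.load_data()`):

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Stand-in data with MNIST's shapes and dtypes
x_train = np.random.randint(0, 256, size=(60000, 28, 28), dtype=np.uint8)
y_train = np.random.randint(0, 10, size=(60000,))

# Reshape 28x28 images into flat vectors of 784 values,
# cast to float32, and scale by the max RGB value (255)
x_train = x_train.reshape(60000, 784).astype("float32") / 255
# One-hot encode the labels into a 60000x10 binary matrix
y_train = to_categorical(y_train, 10)
```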
- Now we construct a Sequential model with Dense and Dropout layers. First we add a Dense layer with 512 neurons; as this is the first layer, we have to specify the input dimension, so the first hidden layer has 784 inputs and 512 outputs. We use ReLU as the activation function.
- The next one is another Dense layer, with 256 neurons, followed by a Dropout layer with a rate of 0.2. Dropout is a technique used to prevent the model from overfitting; this dropout layer will randomly drop 20% of its inputs during training.

- Then we have another Dense layer with 64 neurons. Finally, we have a Dense output layer with the softmax activation function, which converts the results into probability values.
- The data is classified into the class with the highest probability value. As we have ten classes, the final Dense layer contains ten nodes, meaning the model produces ten outputs.
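The architecture described above, sketched with `tensorflow.keras` (an explicit `Input` layer declares the 784-dimensional input):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout

model = Sequential([
    Input(shape=(784,)),              # flattened 28x28 pixels
    Dense(512, activation="relu"),    # first hidden layer: 784 -> 512
    Dense(256, activation="relu"),
    Dropout(0.2),                     # drop 20% of inputs during training
    Dense(64, activation="relu"),
    Dense(10, activation="softmax"),  # one probability per class
])
```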
- We compile our model. As this is multi-class classification, we use categorical cross-entropy as the loss function and set rmsprop as the optimizer.
- We also use categorical accuracy as a metric. Let's compile and train our model on the training dataset.
- We set epochs to 10 and the batch size to 128; since our training dataset contains 60,000 samples, there will be 469 batches of up to 128 samples each.
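Compiling and training can be sketched like this; to keep the snippet self-contained and quick to run, it rebuilds the model and fits one epoch on small random stand-in data, whereas the tutorial runs `model.fit(x_train, y_train, epochs=10, batch_size=128)` on the full 60,000 samples:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout

model = Sequential([
    Input(shape=(784,)),
    Dense(512, activation="relu"),
    Dense(256, activation="relu"),
    Dropout(0.2),
    Dense(64, activation="relu"),
    Dense(10, activation="softmax"),
])
model.compile(loss="categorical_crossentropy",  # multi-class loss
              optimizer="rmsprop",
              metrics=["categorical_accuracy"])

# Small random stand-in batch; the tutorial uses the real x_train/y_train
x_demo = np.random.rand(256, 784).astype("float32")
y_demo = np.eye(10)[np.random.randint(0, 10, 256)].astype("float32")

history = model.fit(x_demo, y_demo, epochs=1, batch_size=128, verbose=0)
```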
- It’s time for model evaluation. We evaluate our model on the test dataset; the evaluation function returns the loss and accuracy of the model.

The model is more than 98% accurate on our test dataset, with a loss of 9.04. We then compare the predicted values with the actual values for 20 points.
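The evaluation and comparison steps can be sketched as below; a minimal untrained stand-in model and random test data keep the snippet self-contained (in the tutorial, `model`, `x_test`, and `y_test` come from the earlier steps):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

# Stand-in model and test data with the right shapes
model = Sequential([Input(shape=(784,)), Dense(10, activation="softmax")])
model.compile(loss="categorical_crossentropy", optimizer="rmsprop",
              metrics=["categorical_accuracy"])
x_test = np.random.rand(100, 784).astype("float32")
y_test = np.eye(10)[np.random.randint(0, 10, 100)].astype("float32")

# evaluate() returns the loss and the configured metric(s)
loss, accuracy = model.evaluate(x_test, y_test, verbose=0)

# Compare predicted vs. actual classes for the first 20 test points
predicted = model.predict(x_test[:20], verbose=0).argmax(axis=1)
actual = y_test[:20].argmax(axis=1)
```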