Table of Contents
- Convolutional neural network implementation
- Video Explanation
Convolutional neural network implementation
- To implement a CNN in PyTorch we import torch, torch.nn, torchvision, torchvision.transforms, torchvision.datasets, and Variable from torch.autograd. (In PyTorch 0.4 and later, Variable is no longer needed, because tensors track gradients directly.)
- We train for 10 epochs. The batch size can be tuned to optimize the model, as discussed in the previous content; here we use a batch size of 100 and a learning rate of 0.001.
- Next, a brief look at torchvision.transforms. It provides a class, transforms.Compose, that lets you standardize your dataset: you pass it a list of transformations, and they are applied in order, as you can see in figure 1. Here we use transforms.ToTensor() followed by transforms.Normalize(). The two normalization constants are the mean, 0.1307, and the standard deviation, 0.3081; Normalize subtracts the mean from every pixel and divides the result by the standard deviation.
- Then we create a folder called data where the dataset is stored, as you can see in figure 1. The dataset is Fashion-MNIST.
Once the dataset has been loaded, the next step is to wrap it in a data loader, which serves the data in batches. Figure 2 then shows the structure of the convolutional neural network.
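A data loader for this setup might look like the sketch below. The helper name `make_loaders` is hypothetical, shuffling the training set is an assumption (the text does not say), and random tensors stand in for Fashion-MNIST so that the example runs without a download.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

batch_size = 100  # batch size quoted in the text


def make_loaders(train_dataset, test_dataset, batch_size=batch_size):
    """Hypothetical helper: wrap two datasets in batch-serving loaders."""
    # Shuffling the training data each epoch is an assumption, not from the text.
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
    return train_loader, test_loader


# Synthetic stand-in data with the same shapes as Fashion-MNIST samples.
fake_train = TensorDataset(torch.randn(300, 1, 28, 28),
                           torch.randint(0, 10, (300,)))
fake_test = TensorDataset(torch.randn(100, 1, 28, 28),
                          torch.randint(0, 10, (100,)))

train_loader, test_loader = make_loaders(fake_train, fake_test)
images, labels = next(iter(train_loader))
print(images.shape, labels.shape)  # torch.Size([100, 1, 28, 28]) torch.Size([100])
```

With the real datasets, the same call would take the training and test sets in place of the synthetic tensors.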
- As you can see in figure 2, we use two convolutional layers. As discussed in the previous content, each convolutional layer is followed by a pooling layer, and the result is then passed to a multi-layer perceptron, which gives us the classification: one score per class, which can be turned into probabilities.
- In the first convolutional layer we have one input channel and 32 output channels, as you can see in figure 2. The kernel size is 5, which means each filter is 5 x 5; the number of filters equals the number of output channels, so there are 32 of them. The stride is 1. As discussed before, we can downsample in two ways: with a strided convolution or with pooling. In figure 2 we use pooling for downsampling. The padding is 2, which adds two layers of zeros on all four sides of the image, so a 5 x 5 convolution with stride 1 leaves the 28 x 28 image size unchanged.
- The activation function is ReLU, and it is followed by max pooling. As you may remember, a pooling layer can be implemented with either of two modules, MaxPool2d or AvgPool2d. Here we use MaxPool2d with a stride of 2, so each spatial dimension is halved.
- For the second convolutional layer, figure 2 shows 32 input channels, 64 output channels, a kernel size of 5, a stride of 1 and a padding of 2. The activation is again ReLU, and MaxPool2d is again used for pooling.
- After convolution and pooling comes the MLP, which we build from linear layers. There are two of them. The first takes 7*7*64 input features (the 64 feature maps of size 7 x 7 that remain after two rounds of pooling, 28 -> 14 -> 7) and produces 1000 output features. The second takes 1000 input features and produces 10 outputs, one for each of our 10 classes. We then put everything together in the forward pass. The backward pass does not have to be written by hand: PyTorch's autograd derives it from the forward pass and computes the gradients when loss.backward() is called.
- All of the structure discussed above is defined in a single class called ConvNet.
- We first instantiate the ConvNet class, then define the loss as the cross-entropy loss and the optimizer as Adam.
- Now we train the model, and once training is done we calculate its test accuracy. We first call model.eval(), which switches the model to evaluation mode so that it can be run on the test dataset. To calculate the accuracy we loop over the test batches: for each image we take the class with the highest predicted probability and count how many of these predictions equal the actual labels. Summing these counts gives the number of correct predictions, which we divide by the total number of test images to get a percentage.
- After running the code, we can see that the test accuracy of our model is 90.78 percent, which is a good result for a network this small.
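The network, training loop and accuracy calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the code from figure 2: the pooling kernel size of 2 is assumed (the text only gives the stride), no activation is placed between the two linear layers because the text does not mention one, and the names `layer1`, `layer2`, `fc1`, `fc2`, `train_one_epoch` and `evaluate` are hypothetical.

```python
import torch
import torch.nn as nn


class ConvNet(nn.Module):
    """Two conv/ReLU/max-pool blocks followed by two linear layers."""

    def __init__(self, num_classes=10):
        super().__init__()
        # 1x28x28 -> 32x28x28 (padding 2 keeps the size) -> 32x14x14 after pooling
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),  # kernel size 2 is an assumption
        )
        # 32x14x14 -> 64x14x14 -> 64x7x7 after pooling
        self.layer2 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.fc1 = nn.Linear(7 * 7 * 64, 1000)
        self.fc2 = nn.Linear(1000, num_classes)

    def forward(self, x):
        out = self.layer2(self.layer1(x))
        out = out.flatten(start_dim=1)  # (batch, 7*7*64)
        # Returns raw class scores; CrossEntropyLoss applies log-softmax itself.
        return self.fc2(self.fc1(out))


def train_one_epoch(model, loader, criterion, optimizer):
    """Run one pass over `loader` and return the mean batch loss."""
    model.train()
    total_loss, num_batches = 0.0, 0
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()   # autograd computes the gradients
        optimizer.step()  # Adam updates the weights
        total_loss += loss.item()
        num_batches += 1
    return total_loss / max(num_batches, 1)


def evaluate(model, loader):
    """Return the accuracy on `loader` as a percentage."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            predicted = model(images).argmax(dim=1)  # highest-scoring class
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return 100.0 * correct / total


model = ConvNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
```

Training would then call `train_one_epoch` once per epoch on the training loader (10 times with the settings above) and finish with `evaluate` on the test loader.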