Table of Contents
- Simple Classification Task using Neural Network
- Video Explanation
Simple Classification Task using Neural Network
To build a neural network in PyTorch:
- First, import torch, torchvision, torch.nn, torchvision.transforms, torchvision.datasets, and Variable from torch.autograd. We also import the time package to measure how long each epoch takes to run.
- Once all the modules and packages are imported, the next step is to download the dataset.
- Here we download the Fashion-MNIST dataset.
- We store the dataset in the my_data folder.
- Next, we decide the batch size used to load the dataset and the number of epochs to run.
- As figure 1 shows, we set the number of epochs to 10 and the batch size to 100. This means that although there are 60,000 training images and 10,000 test images, only 100 images are loaded into main memory at a time.
- To implement this, use torch.utils.data.DataLoader and pass it the dataset. Here the datasets are named trainset and testset.
- Figure 2 shows the implementation.
- As you can see in figure 2, the train loader wraps the training dataset and loads 100 images at a time. You pass it trainset, the training dataset, followed by the batch size decided earlier. Other values such as 200 or 300 also work, and 128 or 256 are common choices. We chose 100 because it divides both 60,000 and 10,000 evenly, so every batch has the same size.
- Setting shuffle=True shuffles the order of the images, here for both the training and the testing dataset. The number of workers sets how many worker processes load the data in parallel. Running this gives us the train loader and the test loader, which are very useful tools in PyTorch.
- We get 60,000 training images and 10,000 test images. Since each batch holds 100 images, the train loader should contain 60,000 / 100 = 600 batches; if it reports 600, it is set up correctly. Likewise, the test loader will have 10,000 / 100 = 100 batches.
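The loading steps above can be sketched as follows. This is a minimal sketch, not the code from figure 2: to stay runnable offline it uses a hypothetical stand-in class, `FakeFashionMNIST`, that returns blank 28 x 28 images instead of the downloaded data, and `num_workers=0` is an assumed value.

```python
import torch
from torch.utils.data import DataLoader, Dataset

BATCH_SIZE = 100  # batch size from the text
NUM_EPOCHS = 10   # number of epochs from the text


class FakeFashionMNIST(Dataset):
    """Hypothetical stand-in with the same sample count and image shape."""

    def __init__(self, n_samples):
        self.n_samples = n_samples

    def __len__(self):
        return self.n_samples

    def __getitem__(self, idx):
        image = torch.zeros(1, 28, 28)  # blank grayscale image
        label = idx % 10                # dummy label in 0..9
        return image, label


trainset = FakeFashionMNIST(60_000)
testset = FakeFashionMNIST(10_000)

# shuffle=True for both loaders, as in the text; num_workers=0 is an assumption.
train_loader = DataLoader(trainset, batch_size=BATCH_SIZE, shuffle=True, num_workers=0)
test_loader = DataLoader(testset, batch_size=BATCH_SIZE, shuffle=True, num_workers=0)

# Number of batches per loader: 60,000 / 100 and 10,000 / 100.
print(len(train_loader), len(test_loader))

# Fetch one batch to inspect its shape.
images, labels = next(iter(train_loader))
print(tuple(images.shape), tuple(labels.shape))
```

With the real data, trainset and testset would come from torchvision.datasets.FashionMNIST instead of the stand-in class; the DataLoader calls stay the same.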
To give an idea of what the dataset looks like, here is a sample:
Here we use NumPy to convert the trainset data to a NumPy array so that we can iterate over it, as you can see in figure 3. We then display the images with matplotlib. In the figure, the first image is a shoe and the second image is a T-shirt.
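A minimal sketch of this conversion is shown below. It is not the code behind figure 3: the two images are random stand-ins rather than real Fashion-MNIST samples, and `show_images` and the output file name are hypothetical.

```python
import numpy as np
import torch

# Hypothetical stand-in for two images taken from trainset (shape: N x 1 x 28 x 28).
images = torch.rand(2, 1, 28, 28)

# Convert to NumPy and drop the channel axis so each image is a 28 x 28 array.
images_np = images.numpy().squeeze(axis=1)


def show_images(arrays, path="samples.png"):
    """Draw the images side by side and save the figure to `path`."""
    # Imported here so the conversion above runs even without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, len(arrays))
    for ax, array in zip(np.atleast_1d(axes), arrays):
        ax.imshow(array, cmap="gray")
        ax.axis("off")
    fig.savefig(path)
    plt.close(fig)
```

Calling `show_images(images_np)` writes both images to a single figure file.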
Now, let's build the first neural network.
- We name the model simple_MLP. As mentioned before, a neural network model is built in an object-oriented style in Python: you define a class and, inside it, all the methods needed to build the network.
- MLP stands for multi-layer perceptron, which is another name for a fully connected neural network. As you can see in figure 3, the constructor of the class calls super() to initialize its parent class.
- Here we have a size list. When creating the model, we pass the list 784, 100, 10. 784 is the number of pixels in each image once the 28 x 28 image is flattened, 100 is the number of units in the hidden layer, and 10 is the number of classes.
- The length of the size list determines how many layers the MLP has. In this case the list has three entries. We create one linear layer and pass its output through a ReLU layer. ReLU stands for rectified linear unit and is an activation function.
- Then we append another linear layer, and finally a Sequential container combines all of them.
- The first linear layer has dimensions 784 by 100 and the second linear layer has dimensions 100 by 10. Each linear layer is created from a pair of consecutive entries in the size list: entries 0 and 1, then entries 1 and 2.
- The first layer takes an input of size 784 and outputs 100. The second layer takes an input of size 100 and outputs 10. The output size is 10 because the dataset has 10 classes.
- Here we create the loss function, which is cross-entropy loss, and set the learning rate to 0.001.
- We use Adam as the optimizer. Adam stands for adaptive moment estimation.
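The model described in the list above can be sketched roughly as follows. This is one possible reading of the description, not the original code from the figures: the attribute name `net`, the loop over the size list, and the flattening step in `forward` are assumptions.

```python
import torch
import torch.nn as nn


class simple_MLP(nn.Module):
    def __init__(self, size_list):
        super().__init__()  # initialize the parent class, nn.Module
        layers = []
        # Pair consecutive sizes: (784, 100) gets a ReLU, (100, 10) is the output layer.
        for i in range(len(size_list) - 2):
            layers.append(nn.Linear(size_list[i], size_list[i + 1]))
            layers.append(nn.ReLU())
        layers.append(nn.Linear(size_list[-2], size_list[-1]))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        x = x.flatten(start_dim=1)  # N x 1 x 28 x 28 -> N x 784
        return self.net(x)


model = simple_MLP([784, 100, 10])
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Shape check with a dummy batch of 100 blank images.
dummy_batch = torch.zeros(100, 1, 28, 28)
print(tuple(model(dummy_batch).shape))
```

The model returns raw scores (logits), one per class, because nn.CrossEntropyLoss applies the softmax internally.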
Then we will implement the loss function as you can see in figure 5.
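To make clear what the cross-entropy loss computes, here is a small worked sketch. The logits and targets are made-up values for illustration only; they are not taken from the model in this article.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Made-up logits for a batch of 2 samples and 10 classes (illustrative values only).
torch.manual_seed(0)
logits = torch.randn(2, 10)
targets = torch.tensor([3, 7])  # made-up class labels

loss = criterion(logits, targets)

# Same quantity by hand: mean negative log-probability of the correct class.
log_probs = torch.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(2), targets].mean()

print(torch.allclose(loss, manual_loss))  # True
```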
- Then check whether CUDA is available. Next, define the training method and the testing method. Finally, run the neural network as shown in figure 4. It will take some time to run, so please be patient.
- The training method, train_epoch, takes the model (the instance of simple_MLP built earlier), the train loader, the optimizer, and the criterion. Inside it, we call optimizer.zero_grad() to reset the gradients. The data from the train loader is moved to the device, and so is the target, as mentioned earlier.
- The outputs are obtained by applying the model to the data, and the loss is calculated between the predicted target and the actual target. Then we call loss.backward(), which is back-propagation, and optimizer.step(), which updates the weights and biases. We then print the training loss and how much time the epoch took to run, and finally return the running loss on the training dataset. The testing method follows the same procedure without the weight updates, and returns the testing loss and accuracy.
- After running it you will get the training loss, how much time it took, the testing loss, and the testing accuracy. As you can see, the test accuracy increases over time, and our final test accuracy is 87.79 percent.
- Then we create a plot to check whether the training loss and the testing loss decrease over time, as shown in figure 5. Both do, which indicates that the model is training well. In most cases the training loss decreases more sharply than the testing loss, and the accuracy increases with each epoch, so the plot looks good.