Table of Contents
- Convolutional neural network
- Code Example
- Video Tutorial
1 Convolutional neural networks:
- The convolutional neural network is one of the variants of neural networks used heavily in the field of computer vision and image processing.
- It derives its name from convolution, as at least one of its layers performs the convolution operation.
- The convolution is applied to the input data using a convolution filter to produce a feature map. We perform the convolution operation by sliding this filter over the input at every location.
- At each location, we do element-wise multiplication between the filter and the input patch and sum the result. This sum goes into the feature map.
- Unlike in a fully connected network, a neuron in a CNN connects only to neurons close to it, and the same weights are shared across all locations.
- The hidden layers of a CNN typically consist of convolutional layers, pooling layers, fully connected layers, and normalization layers.
- CNN is a pioneer in the domain of image processing and computer vision.
- Here is a small example of a CNN. The CNN takes an input signal and applies a filter over it. It essentially multiplies the input signal with the kernel to get the modified signal.
- Then the signal reaches the pooling layer, which usually reduces the dimensionality. This enables us to reduce the number of parameters, which both shortens the training time and combats overfitting.
- The most common type of pooling is max pooling, which just takes the max value in the pooling window. We typically stack a couple of convolutional and pooling layers.
- Then the signal may go through one or more fully connected layers, which are basically dense layers. As the CNN performs multiclass classification, the final output layer uses the softmax activation function.
- The softmax activation function creates a probability distribution over n classes and produces an output vector of length n.
- Each element of the vector is the probability that the input belongs to the corresponding class. The most likely class is chosen by selecting the index of the vector element with the highest probability.
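The steps described above (filter, pooling, dense layer, softmax, argmax) can be sketched in plain NumPy. This is only a minimal illustration, not how Keras implements these layers: the 6x6 input, the 3x3 filter values, and the random dense weights are all made up for the example, and the function names are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image`; at each location multiply element-wise and sum.

    Like most deep learning libraries, the kernel is not flipped.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(window * kernel)
    return feature_map

def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

# Toy 6x6 single-channel input and a 3x3 filter (values are made up).
image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0]])

feature_map = conv2d_valid(image, kernel)  # shape (4, 4)
pooled = max_pool(feature_map)             # shape (2, 2): each side is halved

# Hypothetical dense layer: random weights mapping the 4 pooled values to 3 classes.
rng = np.random.default_rng(0)
weights = rng.normal(size=(pooled.size, 3))
probs = softmax(pooled.flatten() @ weights)

# The predicted class is the index with the highest probability.
predicted_class = int(np.argmax(probs))
```

A real CNN learns the filter and dense weights during training; here they are fixed only to show how the shapes change from one step to the next.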
We are going to use the CIFAR-10 dataset. It is an established computer vision dataset used for object recognition. The data consists of 60,000 32 by 32-pixel color images in ten classes, with 6,000 images per class. After splitting the dataset, we get 50,000 training images and 10,000 test images.
These nine are examples of input images; the dataset has a total of ten classes. The input of our model will be the images, and our model will try to predict the class. Take this horse image, for example: our model should return seven, which means horse.
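To make the label numbering concrete, here is a small sketch of the mapping from label index to class name. It assumes the standard CIFAR-10 class order; the helper name label_to_name is hypothetical.

```python
# CIFAR-10 class names in label-index order (0 to 9).
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

def label_to_name(label):
    """Map an integer label (0-9) to its class name."""
    return CIFAR10_CLASSES[label]

# Dataset sizes quoted in the text.
images_per_class = 6000
total_images = images_per_class * len(CIFAR10_CLASSES)  # 60,000 images overall
train_images, test_images = 50000, 10000

print(label_to_name(7))  # horse
```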
Model Training and Evaluation:
- Now we move to Jupyter to work with the CNN model. First, we import the Sequential model API from Keras. In this blog, we are going to use basic Dense and Dropout layers, so we have to import them from Keras.
- Conv2D and MaxPooling2D are used specifically for CNNs, and we have to import them as well.
- ImageDataGenerator is used to generate batches of tensor image data with real-time data augmentation.
- We import the RMSprop optimizer from Keras. The Keras utilities help us convert our labels to a binary matrix. Matplotlib will help us display the image data, and NumPy is going to help us process our data.
- Now we load the CIFAR dataset with its load function, which loads the data and splits it into training and test sets.
- Then we reshape our data and normalize it using the mean and standard deviation.
- After that, we convert our output labels into a binary matrix. to_categorical converts each label into a vector of dimension ten.
- The shape of each training image is 32x32x3, where the 3 is for the RGB channels. Our output dimension is ten.
- Now we construct a Sequential model. Then we add the first layer, a convolutional layer with 32 filters of kernel size three by three. We use ReLU as the activation function.
- We also set padding to same, which means the input is padded so that the output keeps the same width and height. As this is the first layer, we have to mention the input dimension, which is 32 by 32 by three.
- Then we have another convolutional layer with the same 32 filters of size three by three. After that, we have our MaxPooling2D layer with a window size of two by two, which reduces each spatial dimension by half.
- After this layer, the output height and width will be half of the input's. Then we have a dropout layer. It will drop 20 percent of its inputs at the time of model training. Then we have another set of max pooling and dropout layers.
- The Flatten layer will flatten the output. Finally, we have our dense output layer with 10 output nodes, as this is a multiclass classification. We set the activation function to softmax.
- With ImageDataGenerator, the augmented image data is generated by transforming actual training images with rotations and flips.
- We compile our model. As this is a multiclass classification, we use categorical cross-entropy as the loss function, and we set RMSprop as the optimizer.
- We also use categorical accuracy as the metric. Let's compile and train our model with the training dataset.
- Now we start the training using the model.fit() function by providing data to it. Here we set the epochs to 10 and the batch size to 128. As our training dataset contains 50,000 samples, there will be 391 batches per epoch: 390 of 128 samples and a final one of 80.
- It's time for model evaluation, and our validation data, x_test and y_test, is ready. We'll evaluate our model using the test dataset. The evaluation function will return the loss and accuracy of the model.
- Our model is around 74% accurate for our test dataset with a loss of 9.04.
- Then we will predict and visualize the output for 20 images using model.predict(). As it generates a 10-dimensional vector for each image, we use argmax to find the index of the maximum probability and take that index as the predicted label.
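The data-handling steps around the model (normalizing, converting labels to a binary matrix, counting batches, and reading predictions with argmax) can be sketched in NumPy without Keras. This is an illustration under stated assumptions, not the code from the notebook: the helpers one_hot and standardize are hypothetical stand-ins for what the Keras utilities do, the small epsilon in the division is my own addition, and the images and probabilities are random placeholders rather than real CIFAR data or real model output.

```python
import math
import numpy as np

def one_hot(labels, num_classes):
    """Convert integer labels to a binary class matrix (one row per sample)."""
    labels = np.asarray(labels).reshape(-1)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

def standardize(x, mean, std):
    """Normalize pixel values using a mean and standard deviation."""
    return (x - mean) / (std + 1e-7)  # epsilon avoids division by zero (assumption)

# Hypothetical stand-in for a few 32x32x3 images and their labels (random data).
rng = np.random.default_rng(0)
x_sample = rng.integers(0, 256, size=(4, 32, 32, 3)).astype("float32")
y_sample = np.array([[7], [0], [3], [9]])

x_norm = standardize(x_sample, x_sample.mean(), x_sample.std())
y_onehot = one_hot(y_sample, num_classes=10)  # shape (4, 10)

# Batches per epoch for 50,000 samples and batch size 128.
batches_per_epoch = math.ceil(50000 / 128)    # 391; the last batch holds 80 samples

# Stand-in for model.predict output: one probability vector of length 10 per image.
fake_probs = rng.dirichlet(np.ones(10), size=4)
predicted_labels = np.argmax(fake_probs, axis=1)  # index of the highest probability
```

With a trained model, fake_probs would be replaced by the array returned from model.predict, and predicted_labels could then be compared against the true labels.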