Table of Contents
- Regression in Deep Learning
- Code Example
- Video Tutorial
1. Regression in Deep Learning:
- In this blog, we are going to build a multilayer perceptron (MLP) for regression analysis.
- Regression is a basic type of supervised learning. In its simplest form, linear regression, it tries to find the straight line that best describes the main trend in the training data.
- Regression analysis helps us model the relationship between a dependent variable and one or more independent variables.
- Regression models predict a continuous value: they tell us how much we can expect the dependent variable to move when the independent variables move. The weights in a regression model indicate how strongly each variable contributes to the prediction of the dependent variable.
- We use the mean squared error as the loss function and a gradient-descent-based optimizer.
- This is the basic MLP model we use in regression analysis: x1, x2, and so on are the inputs, and the w's are the weights.
- The output is the sum of every input multiplied by its weight, plus a bias. This sum then passes through an activation function.
- For regression we use a linear activation function on the output, whereas for binary classification it would be a sigmoid activation function.
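The points above can be sketched in plain Python, without any library, so that the arithmetic stays visible. This is a minimal illustration under our own assumptions, not the Keras implementation: the function names (neuron, mse, gradient_step), the toy data, and the learning rate are all made up for this example.

```python
import math

def neuron(inputs, weights, bias, activation="linear"):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    if activation == "linear":       # regression: the sum is returned unchanged
        return z
    if activation == "sigmoid":      # classification: squashed into (0, 1)
        return 1.0 / (1.0 + math.exp(-z))
    raise ValueError(f"unknown activation: {activation}")

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def gradient_step(samples, targets, weights, bias, lr):
    """One gradient-descent update of a single linear neuron under MSE loss."""
    n = len(samples)
    errors = [neuron(x, weights, bias) - t for x, t in zip(samples, targets)]
    # d(MSE)/dw_j = (2/n) * sum(error_i * x_ij),  d(MSE)/db = (2/n) * sum(error_i)
    grad_w = [2.0 / n * sum(e * x[j] for e, x in zip(errors, samples))
              for j in range(len(weights))]
    grad_b = 2.0 / n * sum(errors)
    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b

# Toy data generated from y = 2*x1 + 3*x2 + 1 (made up for illustration).
samples = [[1.0, 2.0], [2.0, 0.0], [0.0, 1.0], [3.0, 1.0]]
targets = [2 * x1 + 3 * x2 + 1 for x1, x2 in samples]

weights, bias = [0.0, 0.0], 0.0
initial_loss = mse(targets, [neuron(x, weights, bias) for x in samples])
for _ in range(2000):
    weights, bias = gradient_step(samples, targets, weights, bias, lr=0.05)
final_loss = mse(targets, [neuron(x, weights, bias) for x in samples])

print("weights:", weights, "bias:", bias)
print("loss before:", initial_loss, "loss after:", final_loss)
```

With these toy values the weights should move towards 2 and 3 and the bias towards 1, because that is the rule the data was generated from. An MLP stacks many such neurons in layers, and Keras computes the gradients for us.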
2. Code Example:
- In this blog, we will use the Boston Housing data set, which contains information collected by the U.S. Census Service concerning housing in the area of Boston.
- This data set contains 506 samples with fourteen columns: thirteen features and one target.
The goal of our regression problem is to use the thirteen features to predict the final column, which is the price of the house.
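To make the layout concrete, here is a small sketch of how a table with fourteen columns separates into thirteen features and one target. The rows are placeholder numbers that we generate ourselves, not real Boston Housing records, and the variable names are only for illustration.

```python
# Placeholder rows: 13 feature values followed by the target in the last column.
# These numbers are invented for illustration, not taken from the data set.
rows = [
    [0.1 * i + j for j in range(13)] + [20.0 + i]
    for i in range(5)
]

features = [row[:-1] for row in rows]   # the first thirteen columns
targets = [row[-1] for row in rows]     # the final column (the price)

print(len(features[0]), "features per sample")
print("targets:", targets)
```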
Code in Jupyter:
- First, we import the Sequential model API from Keras. We are going to use Dense and Dropout layers, so we import them from Keras as well.
- Now we load the Boston Housing data set from Keras. The load_data function loads the data and splits it into a training set and a test set.
- Our training set contains 80 percent of the data, so we have 404 samples of 13 features in the training set and 102 samples in the test set.
- Then we normalize our training and test data. First, we fit the scaler on the training data set, which returns a scaler that holds the mean and standard deviation of the training data. Then we call transform to scale both the training and the test set with those same values.
- Now we build a Sequential model with Dense and Dropout layers. First, we add a Dense layer with 64 neurons.
- As this is the first layer, we have to specify the input dimension. As the data contains 13 features, the input dimension will be 13. So the first hidden layer has 13 inputs and 64 outputs. We use ReLU as our activation function.
- The next one is another Dense layer with 32 neurons and the same activation function.
- Then comes a Dropout layer with a rate of 0.2. Dropout is a technique used to prevent the model from overfitting. This layer randomly sets 20 percent of its inputs to zero at each training step, and it is inactive when the model predicts.
- After that, we have another Dense layer with 16 neurons. Finally, we have a Dense output layer. We keep its default activation function, which is linear.
- Now we compile our regression model. We use MSE, the mean squared error, as the loss. We use the Adam optimizer, which keeps exponential moving averages of the gradient and of the squared gradient. Its parameters beta_1 and beta_2 control the decay rates of these moving averages.
- We take the mean absolute error as a metric. The metric has nothing to do with the model training. It is just a user-friendly value that is easier to interpret.
- It is time to train our model. We provide the training data to the model and run it for 800 epochs.
- The batch size is 32. As our training data set contains 404 samples, there will be 13 batches per epoch: 12 batches of 32 samples and a final batch of 20. Setting verbose to zero trains the model silently.
- Now we evaluate the model. As our loss function was the mean squared error and our metric was the mean absolute error, the evaluate function returns the mean squared error as the loss and the mean absolute error as the metric.
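Some of the numbers in the steps above can be checked with plain arithmetic. The sketch below is not the notebook code and does not use Keras. It reproduces three things in plain Python: scaling with statistics taken from the training data only, the number of weights in each Dense layer, and the number of batches per epoch. The helper names (fit_scaler, transform, dense_params) and the toy rows are our own, and the single output unit is an assumption based on predicting one price per sample.

```python
import math
import statistics

def fit_scaler(train_rows):
    """Per-feature mean and standard deviation, computed from training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]  # assumes no constant column
    return means, stds

def transform(rows, means, stds):
    """Standardize rows using statistics that were fitted on the training set."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Toy rows with two features (invented values).
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[2.0, 40.0]]

means, stds = fit_scaler(train)                # fit on the training data only
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)     # the test set reuses the same statistics

def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

# 13 inputs -> 64 -> 32 -> 16 -> 1 (the final 1 is our assumption).
# The Dropout layer has no weights, so it does not appear here.
layer_sizes = [13, 64, 32, 16, 1]
param_counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(param_counts)

# 404 training samples do not divide evenly by 32, so the last batch is smaller.
n_train, batch_size = 404, 32
batches_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (batches_per_epoch - 1) * batch_size

print("scaled test row:", test_scaled[0])
print("parameters per layer:", param_counts, "total:", total_params)
print("batches per epoch:", batches_per_epoch, "last batch size:", last_batch_size)
```

Fitting the scaler on the training set alone keeps information about the test set out of the training process, which is why the test rows are only transformed and never used to compute the mean or standard deviation.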
Finally, we predict our outcomes from the model. We will compare the predicted outcome with the expected outcome. We will display only the first 10 results.
So our model predicts outcomes that are close to the expected results.
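The two numbers returned by the evaluation, and the side-by-side comparison of predictions, can be sketched in plain Python as follows. The expected and predicted values here are placeholders that we made up, not results from the trained model, and the function names are our own.

```python
def mean_squared_error(expected, predicted):
    """Average of the squared differences (the loss used for training)."""
    return sum((e - p) ** 2 for e, p in zip(expected, predicted)) / len(expected)

def mean_absolute_error(expected, predicted):
    """Average of the absolute differences (the metric, in the units of the target)."""
    return sum(abs(e - p) for e, p in zip(expected, predicted)) / len(expected)

# Placeholder values, not output of the model described above.
expected = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 19.0, 33.0, 40.0]

loss = mean_squared_error(expected, predicted)
metric = mean_absolute_error(expected, predicted)
print(f"MSE = {loss}, MAE = {metric}")

# Show at most the first 10 pairs side by side.
for e, p in list(zip(expected, predicted))[:10]:
    print(f"expected={e:6.2f}  predicted={p:6.2f}  abs_error={abs(e - p):5.2f}")
```

The mean absolute error is in the same units as the target, which is why it is easier to read than the squared loss.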