Table of Contents
- Code Example
- Video Tutorial
- SVM is a very popular and productive algorithm in machine learning.
- Linear support vector machines are quite similar to logistic regression but add the concept of margin maximization.
- The points that lie on the margin or inside it are called support vectors.
- In general, kernel SVMs are more popular than linear SVMs, as kernelization allows the data to be transformed into higher dimensions.
- Once the support vector machine draws the optimal hyperplane, we can classify the labels.
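As a minimal sketch of the ideas above, the snippet below fits a linear SVM on a toy two-blob dataset and inspects its support vectors (the points on or inside the margin). The blob centers are illustrative values, not from the original tutorial.

```python
# Minimal linear SVM sketch: fit on a separable 2-D toy dataset and
# inspect the support vectors (points on or inside the margin).
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters (centers chosen for illustration).
X, y = make_blobs(n_samples=100, centers=[[-3, -3], [3, 3]], random_state=0)

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The fitted model exposes the support vectors directly.
print("number of support vectors:", len(clf.support_vectors_))
```

Only the handful of points closest to the separating hyperplane end up as support vectors; the rest of the data does not affect the decision boundary.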
2 Code Example of support vector machines in sklearn
- We have now imported the necessary libraries, and we have a special dataset: the Social Network Ads data.
- We can inspect the data with dataframe.head(), which shows the first five rows by default.
- Given the features user ID, gender, age, and estimated salary, we need to predict whether the person will make a purchase.
- Let's look at how to handle this. First, we need to understand that not every feature in the dataset is important; we have to select only the necessary features.
- Then use those features for training and prediction. Adding useless features that play no role in the prediction just increases the noise and reduces the accuracy of the model.
- We'll use gender, age, and estimated salary as the value of X to predict the purchase column, where 0 means the purchase did not happen and 1 means it did.
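The feature-selection step can be sketched as below. The column names follow the features described in the text; the four rows are made-up stand-ins for `pd.read_csv("Social_Network_Ads.csv")`, and the numeric gender encoding is one common choice, not necessarily the tutorial's.

```python
# Feature selection sketch, assuming columns: User ID, Gender, Age,
# EstimatedSalary, Purchased. The inline rows are illustrative stand-ins
# for reading the real Social Network Ads CSV.
import pandas as pd

df = pd.DataFrame({
    "User ID": [15624510, 15810944, 15668575, 15603246],
    "Gender": ["Male", "Male", "Female", "Female"],
    "Age": [19, 35, 26, 27],
    "EstimatedSalary": [19000, 20000, 43000, 57000],
    "Purchased": [0, 0, 0, 1],
})

# User ID carries no predictive signal, so it is dropped; Gender is
# mapped to numbers so the SVM can consume it.
X = df[["Gender", "Age", "EstimatedSalary"]].copy()
X["Gender"] = X["Gender"].map({"Male": 0, "Female": 1})
y = df["Purchased"]
print(X.shape, y.shape)
```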
- Once that is done, let's split our data using train_test_split from model_selection, which distributes the data into two major categories: training and test.
- From the training data the model learns how to adjust itself to the values; in the testing phase the developer checks how accurately the model has learned.
- The test size we have defined is 20%, and the random state we are keeping is 42 so that the split is reproducible.
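The split described above looks like this; the arrays are placeholder data standing in for the real features and labels.

```python
# 80/20 split with a fixed random_state for reproducibility,
# matching the text (test_size=0.2, random_state=42).
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(200).reshape(100, 2)  # placeholder features
y = np.arange(100) % 2              # placeholder labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
print(len(X_train), len(X_test))  # 80 training rows, 20 test rows
```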
- Then we need to scale the data, which is a very important part of pre-processing. Once that is done, we create the SVM classifier.
- We call the support vector classifier (SVC) and choose a kernel; different kernels suit different datasets. By default the kernel is RBF, and the random state is initialized with 0.
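Putting the scaling and classifier steps together gives something like the sketch below; a synthetic dataset stands in for the Social Network Ads data, so the exact score will differ from the tutorial's.

```python
# Scale the features, then fit an SVC with the default RBF kernel.
# Synthetic data stands in for the real dataset here.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=3, n_informative=3,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training data only, then apply it to both splits
# so no information leaks from the test set.
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

clf = SVC(kernel="rbf", random_state=0)  # "rbf" is SVC's default kernel
clf.fit(X_train, y_train)
print("accuracy:", round(clf.score(X_test, y_test), 2))
```

Scaling matters for SVMs because the RBF kernel is distance-based: an unscaled salary column would dominate age entirely.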
- First, let's check our prediction score with the RBF kernel.
We can see 0.92, meaning the obtained accuracy is 92%, which is a very good score.
Now let's change the kernel type and see what results we obtain. Pressing Shift + Tab brings up the documentation; this time we use the LINEAR kernel, train the model, and check the predicted values. We can see a bit of a change, so let's see whether it was for better or worse.
We can see the accuracy drops to 86%, so RBF was the better kernel choice here.
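The kernel comparison above can be reproduced as a small loop. Again synthetic data stands in for the real dataset, so the two accuracies printed here are illustrative: which kernel wins depends on the data.

```python
# Compare the RBF and linear kernels on the same scaled split.
# Synthetic data stands in for the Social Network Ads dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

scores = {}
for kernel in ("rbf", "linear"):
    clf = SVC(kernel=kernel, random_state=0).fit(X_train, y_train)
    scores[kernel] = clf.score(X_test, y_test)
    print(kernel, round(scores[kernel], 3))
```

Trying a few kernels this way is a cheap sanity check before committing to one; a full comparison would use cross-validation rather than a single split.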