Table of Contents
- Data Handling
- Video Tutorial
- If you have been studying machine learning, you have probably heard of K Nearest Neighbours (KNN).
- It is one of the most popular algorithms in machine learning.
- In this blog, we will look at what KNN is and why it is known as one of the easiest algorithms.
- The K value is the number of nearest neighbours the algorithm consults when it classifies a point. For example, with K=2 the two closest training points vote on the class, and with K=5 the five closest points vote. The number of classes comes from the labels in the data, not from K.
- So, once we have a set of labelled data points, the algorithm takes a new value, finds the neighbours nearest to it and assigns it to the category that most of those neighbours belong to.
- Nearness is measured with a distance such as the Euclidean or the Manhattan distance from the new point to every point in the given data. No centroid is computed or moved; that step belongs to K-means clustering, not KNN.
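To make the neighbour vote concrete, here is a minimal sketch in plain Python. The function names (`euclidean`, `manhattan`, `knn_predict`) and the six toy points are made up for illustration and are not taken from the blog's data set.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line distance between two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute differences along each axis.
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    # Measure the distance from the query to every stored training point.
    scored = sorted(
        ((distance(point, query), label) for point, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    # Keep the labels of the k closest points and take a majority vote.
    nearest_labels = [label for _, label in scored[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two features per point, two classes.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (6.5, 7.0), (7.0, 6.0)]
train_y = ["no", "no", "no", "yes", "yes", "yes"]

print(knn_predict(train_X, train_y, (2.0, 2.0), k=3))
print(knn_predict(train_X, train_y, (6.0, 6.5), k=3, distance=manhattan))
```

Note that there is no separate training function in this sketch: the "model" is just the stored points, and all the work happens when a prediction is requested.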
2 Data Handling
- Let’s see how to apply this to our data set.
- We import the necessary libraries and load the social network data on purchase transactions.
- Next we split the data into training and test sets. The test size we have chosen is 20 percent, which leaves 80 percent of the data for training.
- Here we use a random state of 42, which the user can change. Then we apply the standard scaler transformation as the pre-processing step, which matters for KNN because it relies on distances between feature values.
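The split and scaling steps might look like the sketch below, assuming scikit-learn is used. The social network file itself is not shown in this post, so `make_classification` generates a synthetic stand-in with two numeric features and a binary target; only the 20 percent test size and the random state of 42 come from the text.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the blog's data (assumption): 400 rows, 2 features, 2 classes.
X, y = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=42
)

# 80 percent of the rows go to training, 20 percent to testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows only, then reuse it on the test rows.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(X_train.shape, X_test.shape)
```

Fitting the scaler on the training rows alone keeps information from the test set out of the pre-processing step.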
Now, we call our algorithm, the KNN classifier.
- Once the model has been trained, we get the predicted values of y.
- We can also check the accuracy obtained with k=5. We obtained 92.5%, which is quite good. Reducing k can also reduce the accuracy, although the best k depends on the data set.
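A sketch of the classifier step, again assuming scikit-learn and the same synthetic stand-in data rather than the social network file, so the accuracies it prints will not match the 92.5% reported here. The loop over several k values is an addition for illustration, a simple way to see how accuracy moves with k on a given data set.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the blog's data (assumption), split and scaled as before.
X, y = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# k=5 neighbours; the default metric in scikit-learn is Minkowski with p=2 (Euclidean).
classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print("k=5 accuracy:", accuracy_score(y_test, y_pred))

# Compare a few k values instead of assuming how accuracy changes with k.
accuracy_by_k = {}
for k in (1, 3, 5, 7):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    accuracy_by_k[k] = accuracy_score(y_test, model.predict(X_test))
print(accuracy_by_k)
```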
Because the training phase of KNN only needs to store the data in RAM, there is no real learning for this model; the distance calculations are deferred until a prediction is requested. That is why it is called a lazy algorithm in machine learning.