# Predicting missing data using the linear method in Python pandas

1. Introduction
2. Linear method
3. Time series method
4. Video Tutorial
1. Introduction

In this practical, we are going to approximate missing values in tabular data using the pandas interpolate() function. This function estimates each missing value from the known values around it, in the same way as mathematical interpolation.

Let’s look at the data set used in this practical, shown in figure 1. It contains the number of vehicles that were booked by non-subscribed users and by subscribed users in January 2011. Please note that a few values are missing (highlighted in red). Let’s see how the interpolate() function can be used to approximate them.

2.  Linear method

First, import the pandas and NumPy libraries, then load the data file. Next, convert the dteday column to the datetime data type and set it as the index column. The code snippet is shown in figure 2.
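Since figure 2 is an image, here is a minimal sketch of the same steps. The data file is replaced by a small in-memory CSV so that the example runs on its own. The column names casual and registered and every value in it are made up for illustration. Only dteday and the variable name stock_data come from this tutorial.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the tutorial's data file. The column names
# "casual" and "registered" and all numbers are assumptions.
csv_text = """dteday,casual,registered
2011-01-01,10,100
2011-01-02,20,
2011-01-03,,300
2011-01-04,40,400
"""

# Load the data (with a real file this would be pd.read_csv("<your file>.csv")).
stock_data = pd.read_csv(io.StringIO(csv_text))

# Convert dteday to the datetime data type and use it as the index.
stock_data["dteday"] = pd.to_datetime(stock_data["dteday"])
stock_data = stock_data.set_index("dteday")

# Empty cells are read as NaN, which is how pandas marks missing values.
print(stock_data)
print(stock_data.isna().sum())
```

The same result can be had in one call with pd.read_csv(..., parse_dates=["dteday"], index_col="dteday"); the three separate steps are kept here to mirror the text.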

Use the interpolate() function to approximate the missing values in the data set, as shown in figure 3. By default, this function uses linear interpolation. The output now shows estimated values where values were previously missing.
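To make the calculation concrete, the sketch below applies interpolate() to a tiny made-up table (the column name casual and the numbers are illustrative, not the tutorial's data). With the default linear method, each gap is filled with values evenly spaced between the nearest known values on either side.

```python
import numpy as np
import pandas as pd

# Made-up example: one single gap and one gap of two consecutive values.
bookings = pd.DataFrame(
    {"casual": [10.0, np.nan, 30.0, np.nan, np.nan, 60.0]},
    index=pd.date_range("2011-01-01", periods=6, freq="D"),
)

# Default method is linear: rows are treated as equally spaced.
filled = bookings.interpolate()
print(filled)

# The single gap becomes the midpoint of its neighbours: (10 + 30) / 2 = 20.
# The double gap is split into equal steps between 30 and 60: 40 and 50.
```

Note that interpolate() returns a new DataFrame; the original bookings table still contains its missing values unless the result is assigned back.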

Now we need to round the values, since a booking count should be a whole number. This can be done by applying the around() function from the NumPy library.

np.around(stock_data.interpolate())

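Note that the values are rounded to the nearest whole number, as shown in figure 4. Values that lie exactly halfway between two integers are rounded to the nearest even one.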
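The rounding behaviour can be checked on a small made-up example (the column name and numbers are illustrative only). The interpolated values 2.5 and 5.5 both sit exactly halfway between two integers, so around() sends each to the nearest even integer.

```python
import numpy as np
import pandas as pd

# Made-up values chosen so that interpolation produces exact halves.
stock_data = pd.DataFrame({"casual": [1.0, np.nan, 4.0, np.nan, 7.0]})

interpolated = stock_data.interpolate()   # gaps become 2.5 and 5.5
rounded = np.around(interpolated)         # 2.5 -> 2.0, 5.5 -> 6.0

print(interpolated["casual"].tolist())
print(rounded["casual"].tolist())
```

The rounded column still has a float data type. If integers are wanted, appending .astype(int) after rounding is one option once no missing values remain.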

As previously discussed, the interpolate() function performs linear interpolation by default. If you want to, you can specify this explicitly with an additional parameter called method. Pass the value "linear" as shown below.

np.around(stock_data.interpolate(method="linear"))

When executed as shown in figure 5, it returns the same values as in figure 4.
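As a quick check of this equivalence, the sketch below compares the two calls on a small made-up table (the column names and values are assumptions, not the tutorial's data).

```python
import numpy as np
import pandas as pd

# Made-up table with one gap in each column.
stock_data = pd.DataFrame(
    {
        "casual": [10.0, np.nan, 30.0, 40.0],
        "registered": [100.0, 200.0, np.nan, 400.0],
    },
    index=pd.date_range("2011-01-01", periods=4, freq="D"),
)

default_result = np.around(stock_data.interpolate())
explicit_result = np.around(stock_data.interpolate(method="linear"))

# Both calls give identical tables.
print(default_result.equals(explicit_result))
```

Writing method="linear" out does not change the result, but it makes the choice visible to anyone reading the code.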