**Table of Contents**

- Introduction
- Linear method
- Time series method
- Video Tutorial

**1. Introduction**

In this practical, we approximate missing values in tabular data using the Pandas **interpolate()** function. This function estimates each missing value from the known values around it, in the same way as mathematical interpolation.

Let’s look at the data set used in this practical, shown in *figure 1*. It contains the number of vehicles booked by non-subscribed users and subscribed users in January 2011. Note that a few values are missing (highlighted in red). Let’s see how the **interpolate()** function can be used to approximate them.

**2. Linear method**

First import the Pandas and NumPy libraries, then load the data file. Convert the dteday column to the datetime data type and set it as the index column. The code snippet is shown in *figure 2*.
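Since the figure is an image, here is a minimal sketch of those steps. The data file itself is not named in this article, so the sketch reads a few made-up rows from an in-memory CSV instead. The column names `casual` and `registered` and all the numbers are assumptions for illustration; only `dteday` and the variable name `stock_data` come from this article.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data file: a few rows with gaps.
csv_text = """dteday,casual,registered
2011-01-01,331,654
2011-01-02,131,
2011-01-03,120,1229
2011-01-08,,891
2011-01-13,38,1368
"""

# Load the data (with a real file, pass its path instead of the StringIO object).
stock_data = pd.read_csv(io.StringIO(csv_text))

# Convert dteday from text to datetime, then use it as the index.
stock_data["dteday"] = pd.to_datetime(stock_data["dteday"])
stock_data = stock_data.set_index("dteday")

print(stock_data)
```

Empty cells in the CSV are read as NaN, which is the value that interpolate() later fills in.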

Use the **interpolate()** function to approximate the missing values in the data set as shown in *figure 3*. By default, this function uses linear interpolation. The cells that were previously missing now contain values.

Now we need to round the values. This can be done by applying the **around()** function from the NumPy library.

**np.around(stock_data.interpolate())**

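Note that the values are rounded to whole numbers as shown in *figure 4*. The around() function rounds a value that lies exactly halfway between two integers to the nearest even integer.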
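To make the arithmetic concrete, here is a small sketch with made-up numbers (the `casual` column name and the values are assumptions, not the article's data). A single gap is filled with the midpoint of its neighbours, and a run of gaps is filled in equal steps.

```python
import numpy as np
import pandas as pd

# Made-up booking counts with one single gap and one double gap.
bookings = pd.DataFrame({"casual": [10.0, np.nan, 15.0, np.nan, np.nan, 21.0]})

# Linear interpolation (the default): 12.5 for the single gap, 17 and 19 for the double gap.
filled = bookings.interpolate()

# Round to whole numbers; 12.5 is a tie, so it goes to the even neighbour 12.
rounded = np.around(filled)

print(filled)
print(rounded)
```

The tie case is worth noticing: 12.5 becomes 12 rather than 13, so rounded totals can differ slightly from what round-half-up would give.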

As previously discussed, the interpolate() function performs linear interpolation by default. If you want to, you can specify this explicitly with an additional parameter called **method**. Pass the value **'linear'** as shown below.

**np.around(stock_data.interpolate(method="linear"))**

When executed as shown in *figure 5*, it returns the same values as in figure 4.
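A quick way to check this yourself is to compare the two results directly. The sketch below uses made-up values and an assumed `registered` column name.

```python
import numpy as np
import pandas as pd

# Made-up data with two gaps.
df = pd.DataFrame({"registered": [654.0, np.nan, 1229.0, np.nan, 1368.0]})

default_fill = np.around(df.interpolate())
explicit_fill = np.around(df.interpolate(method="linear"))

# True when both frames hold the same values in the same places.
print(default_fill.equals(explicit_fill))
```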

**3. Time series method**

Look at the dates in the dteday column. Many dates are not recorded. For example, there are no records between 2011-01-08 and 2011-01-13. Linear interpolation ignores the index and treats all rows as equally spaced, so its approximations are not accurate when the dates are unevenly spaced. The time between records should therefore be taken into account when interpolating. This can be done with the time series method.

Set the **method** parameter to **'time'** (in lowercase) to interpolate using the time series method as shown below.

**np.around(stock_data.interpolate(method="time"))**

Observe the approximated values in *figure 6*. They are different from the values that were obtained using the linear method.
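The difference is easiest to see on a tiny example with an uneven gap. The sketch below uses made-up values and an assumed `registered` column name; the missing row is one day after the first record and five days before the last.

```python
import numpy as np
import pandas as pd

# Three records: the gap after 2011-01-08 is five days long.
dates = pd.to_datetime(["2011-01-07", "2011-01-08", "2011-01-13"])
df = pd.DataFrame({"registered": [100.0, np.nan, 160.0]}, index=dates)

# Linear: ignores the dates, so the gap gets the midpoint of 100 and 160.
linear_fill = df.interpolate(method="linear")

# Time: weights by elapsed time, so the gap sits 1/6 of the way from 100 to 160.
time_fill = df.interpolate(method="time")

print(np.around(linear_fill))
print(np.around(time_fill))
```

Here the linear method gives 130 for 2011-01-08, while the time method gives about 110, because that date is much closer in time to the first record than to the last. The time method requires a datetime index, which is why dteday was set as the index earlier.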