python pandas filter operations and operator or isnull not is null

Table of Contents

  1. And operator
  2. Or operator
  3. Retrieving is null
  4. Retrieving not is null
  5. In operator
  6. Not in operator
  7. Video Tutorial

1 AND operator

  1. AND operator returns values if both left and right operands are true. In python & sign is used for this purpose.
  2. First import Pandas and read the csv data file stock_data to Jupyter notebook as shown in figure 1.  The data contains the Date, Open, High, Low, Close, Adj Close, and Volume columns.
Figure 1: The stock_data csv file

Let’s assume that we want the records printed such that, their High and Low values should be 1334.800049 and 1310.099976. Specify these two values using & operator as shown in the figure 2.

Stock_data[(stock_data.High==1334.800049) & (stock_data.Low==1310.099976)]

Figure 2: Using the AND operator

Let’s take another example. This time lets let’s assume that we want the records printed such that, their High and Low values should be 1334.800049 and 1330.099976. From the data set (figure 1), we can observe that even if there is a record where the High value is 1334.800049, the Low value is not 1330.099976.

Note that, AND operator return values when both operands are true. Hence, this time there are no records to be printed as shown in figure 3.

Figure 3: No records returned if either operand in false

2. OR operator

  1. OR operator returns values if either left or right operands are true. In python Pipe | sign is used for this purpose.
  2. Let’s see how to print out the records where either the High column value is 1334.800049 or the low column value is 1330.099976. This can be done in the same way but instead of & sign, we have to put the pipe: | sign as shown below.

Stock_data[(stock_data.High==1334.800049) | (stock_data.Low==1330.099976)]

Figure 4: Using the OR operator

As another example, assume that records where the High value is 1334.800049 or the low value is 1330.099976 should be printed. This can be done by,

stock_data[(stock_data.High==1334.800049) | (stock_data.Low==1334.800049)]

As shown in figure 5, there are two records which either the High or Low value is the specified value.

Figure 5: OR operation example

3 Retrieving is null

In order to retrieve the records where the value in the Open column is null, the isnull()function can be used. It can be done in the following way,

stock_data[stock_data.Open.isnull() ]

It can be observed that there are two records where the Open values are null as shown in figure 6.

Figure 6: Using isnull function

4 Retrieving not null values

  1. In order to retrieve the record where there are no null values, we can use the notnull() function. Let’s think we want the record where the Open column values are not null. This can be done by the code,

stock_data[stock_data.Open.notnull() ]

Then execute the command. Refer to the Open column in the result in figure 7. There are no null values in the Open column.

Figure 7: Getting the record where the Open column values are not null

To get the count of the records in the list, simply use the count() function at the end as shown in figure-8

Figure 8: Getting the count

5. In operator

The isin() function can be used to filter out the records a particular or multiple values in a particular column

To demonstrate this, first, let’s print the records where the High column values are either 1334.800049 or 1349.750000 as shown in figure 9using the or operator (|).

Figure 9:  Getting the relevant records using OR operator

Then let’s use the isin() function. The advantage is that you can simply give the values in the following way using commas inside the isin function,

stock_data[stock_data.High.isin (1334.800049, 1349.750000)]

Execute the code. It can be observed by comparing figure 9 and figure 10, we are getting the same results by using both ways

Figure 10: Using the isin function to filter records

6. Not in operator

The “not” of the above operation can be done by simply adding the tide~ symbol at the beginning.

Let’s assume we want to get the records which the values of High column are not either 1334.800049, 1349.750000 or 1347.400024. So, the tide symbol should be added as shown below.

stock_data[~stock_data.High.isin (1334.800049, 1349.750000,1347400024)]

Figure 11: Getting the count

Use the count() function to display the number of records shown in the result. This is shown in figure 12. There are 19 records that satisfy the given condition

Figure 12: Getting the count

Video Tutorial

Leave a Reply

Your email address will not be published. Required fields are marked *