Python pandas sorting techniques

Table of Contents

  1. Sorting using a single column
  2. Sorting in descending order
  3. Moving null values either top or bottom
  4. Sorting using multiple columns
  5. Overriding the data frame with results
  6. Video Tutorial

1. Sorting using a single column

To follow along, import pandas and load the data file.

In this section we sort by a single column, Quantity. To do that, make column 2 (the Quantity column) the index column. index_col is a parameter of the function that reads the file, not a function itself, so pass index_col=2 when reading the data, as shown in figure 1, and execute the code.

Figure 1: Making the Quantity column as the index column
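Since the screenshot is not reproduced here, the following is a minimal sketch of this step. It assumes the data is read with read_csv() (the article does not say which reader was used), and the CSV text, the Make column and all values are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the article's data file; the rows are invented.
csv_text = """Make,Year,Quantity,Price
Toyota,2019,5,21000
Honda,2018,,18500
Ford,2019,2,25000
BMW,2017,8,40000
"""

# index_col=2 makes the third column (Quantity, counting from 0) the index.
car_sales = pd.read_csv(io.StringIO(csv_text), index_col=2)

print(car_sales.index.name)
print(car_sales)
```

The empty Quantity field in the Honda row becomes a null (NaN) index value, which is useful for the null-handling examples later on.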

To sort by the index column, use the sort_index() method. By default, it sorts the rows by the index in ascending order. The code is shown below.

car_sales.sort_index()

Observe the output in figure 2. The data set is sorted by the Quantity column in ascending order, and the null value is placed in the last position.

Figure 2: Sorting according to the Quantity column
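As a runnable sketch of the same call, the example below builds a small data frame by hand instead of loading the article's file. The rows and the Make column are hypothetical; only the Quantity, Year and Price names come from the article.

```python
import numpy as np
import pandas as pd

# Hypothetical data with Quantity as the index, including one null value.
car_sales = pd.DataFrame(
    {
        "Make": ["Toyota", "Honda", "Ford", "BMW"],
        "Year": [2019, 2018, 2019, 2017],
        "Price": [21000, 18500, 25000, 40000],
    },
    index=pd.Index([5, np.nan, 2, 8], name="Quantity"),
)

# Ascending sort on the index; the null index value ends up last by default.
by_quantity = car_sales.sort_index()
print(by_quantity)
```

Note that sort_index() returns a new, sorted data frame and leaves car_sales itself unchanged.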

2. Sorting in descending order

  1. To sort in descending order, pass another parameter to the sort_index() method: set ascending=False (note the capital F, as Python booleans are capitalized), as shown in figure 3. The output is then sorted in descending order. Observe that the null value is still at the bottom.
  2. Ascending order is the default, but it can also be requested explicitly by setting ascending=True.
Figure 3: Sorting in descending order
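The descending sort can be sketched as follows, again on hypothetical hand-made data (the Make column and all values are assumptions, not the article's data set).

```python
import numpy as np
import pandas as pd

# Hypothetical data with Quantity as the index, including one null value.
car_sales = pd.DataFrame(
    {
        "Make": ["Toyota", "Honda", "Ford", "BMW"],
        "Year": [2019, 2018, 2019, 2017],
        "Price": [21000, 18500, 25000, 40000],
    },
    index=pd.Index([5, np.nan, 2, 8], name="Quantity"),
)

# Descending sort on the index; the null value still goes to the bottom.
by_quantity_desc = car_sales.sort_index(ascending=False)
print(by_quantity_desc)

# Passing ascending=True explicitly gives the same result as the default.
same_as_default = car_sales.sort_index(ascending=True)
```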

3. Moving null values either top or bottom

Null values can be moved to the top or the bottom by passing an additional parameter, na_position, to the sort_index() method.

If the null value should be shifted to the top (figure 4),

na_position="first"

Figure 4: Shifting the null value to the top

If the null value should be shifted to the bottom (figure 5),

na_position="last"

Figure 5: Shifting the null value to the bottom
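Both settings can be compared side by side in a short sketch. The data frame below is hypothetical (invented rows and an assumed Make column), with Quantity as the index.

```python
import numpy as np
import pandas as pd

# Hypothetical data with Quantity as the index, including one null value.
car_sales = pd.DataFrame(
    {
        "Make": ["Toyota", "Honda", "Ford", "BMW"],
        "Year": [2019, 2018, 2019, 2017],
        "Price": [21000, 18500, 25000, 40000],
    },
    index=pd.Index([5, np.nan, 2, 8], name="Quantity"),
)

# Null index value moved to the top; the rest stay in ascending order.
nulls_first = car_sales.sort_index(na_position="first")
print(nulls_first)

# Null index value moved to the bottom (this is also the default).
nulls_last = car_sales.sort_index(na_position="last")
print(nulls_last)
```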

4. Sorting using multiple columns

Sections 1, 2 and 3 covered sorting by a single column. In this section we sort by multiple columns. This is done by passing the required columns as a list to the sort_values() method.

First, remove the index column specification from the first code snippet and execute it again, as shown in figure 6.

Figure 6: Removing the specification of the index column and executing the code

Then specify the columns to sort by, as shown below:

car_sales.sort_values(['Year', 'Price'])

Here, the data is sorted by the Year column first and then, within each year, by the Price column. This is shown in figure 7.

Figure 7: Sorting according to Year and Price columns
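A minimal sketch of this call on hypothetical data follows. The rows and the Make column are assumptions; Quantity is now an ordinary column rather than the index.

```python
import pandas as pd

# Hypothetical data with a default integer index.
car_sales = pd.DataFrame(
    {
        "Make": ["Toyota", "Honda", "Ford", "BMW"],
        "Year": [2019, 2018, 2019, 2017],
        "Quantity": [5, None, 2, 8],
        "Price": [21000, 18500, 25000, 40000],
    }
)

# Sort by Year first; rows with the same Year are then ordered by Price.
by_year_price = car_sales.sort_values(["Year", "Price"])
print(by_year_price)
```

The two 2019 rows show the effect of the second column: they share a Year, so Price decides their order.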

As another example, assume the data should be sorted by the Quantity and Price columns. Pass these two columns to the sort_values() method as a list, as shown in figure 8. Observe that the result is sorted in ascending order.

Figure 8: Sorting according to Quantity and Price column

In the scenarios above, the data is sorted in ascending order. To sort in descending order, set the ascending parameter to False, as shown in figure 9. Observe the output: the data is now sorted in descending order by both columns.

Figure 9: Sorting in descending order according to multiple columns
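The descending multi-column sort can be sketched on hypothetical data as below (invented rows, assumed Make column). The last call goes slightly beyond the article: sort_values() also accepts a list of booleans for ascending, one per column, when the columns need different directions.

```python
import pandas as pd

# Hypothetical data with a default integer index.
car_sales = pd.DataFrame(
    {
        "Make": ["Toyota", "Honda", "Ford", "BMW"],
        "Year": [2019, 2018, 2019, 2017],
        "Quantity": [5, None, 2, 8],
        "Price": [21000, 18500, 25000, 40000],
    }
)

# Both Year and Price in descending order.
both_desc = car_sales.sort_values(["Year", "Price"], ascending=False)
print(both_desc)

# Year ascending, Price descending within each year.
mixed = car_sales.sort_values(["Year", "Price"], ascending=[True, False])
print(mixed)
```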

5. Overriding the data frame with results

The sorting methods return a new, sorted data frame and leave the original unchanged. To replace the original data frame with the sorted result, use the inplace parameter: set inplace=True to overwrite the original data frame, as shown in figure 10.

Figure 10: Overriding the data frame
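The sketch below illustrates inplace=True on hypothetical data (invented rows, assumed Make column). With inplace=True the method returns None and changes the data frame itself, so the call should not be assigned back to car_sales.

```python
import pandas as pd

# Hypothetical data with a default integer index.
car_sales = pd.DataFrame(
    {
        "Make": ["Toyota", "Honda", "Ford", "BMW"],
        "Year": [2019, 2018, 2019, 2017],
        "Price": [21000, 18500, 25000, 40000],
    }
)

# Sort the data frame itself; the return value is None in this mode.
result = car_sales.sort_values(["Year", "Price"], inplace=True)

print(result)
print(car_sales)
```

An equivalent alternative is to reassign the result of a normal call, for example car_sales = car_sales.sort_values(["Year", "Price"]).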

6. Video Tutorial
