Tuesday, 14 November 2023

Filter a Pandas DataFrame using comparison operators

In this post, I am going to explain how to filter a DataFrame using comparison operators.

 

I am going to use below data set to demonstrate the examples.

       Name  Age       City  Gender  Percentage
0   Krishna   34  Bangalore    Male          81
1     Sailu   35  Hyderabad  Female          76
2      Joel   29  Hyderabad    Male          67
3     Chamu   35    Chennai  Female         100
4  Jitendra   52  Bangalore    Male          87
5       Raj   34    Chennai    Male          89

 

Following comparison operators supported by Pandas to filter the data set.

 

a.   == (equal to)

b.   != (not equal to)

c.    > (greater than)

d.   < (less than)

e.   >= (greater than or equal to)

f.     <= (less than or equal to)

 

Extract the rows with index 2 and 5

To extract the rows with index 2 and 5, we need to define a Boolean series, where the values at indexes 2 and 5 are set to True and set to False at remaining places.

0    False
1    False
2     True
3    False
4    False
5     True

my_list = [False, False, True, False, False, True]
my_series = pd.Series(my_list)

 

Pass the Boolean series to data frame to extract the rows at index 2 and 5. In this case, df[my_series] return the rows at indexes 2 and 5.

   Name  Age       City Gender  Percentage
2  Joel   29  Hyderabad   Male          67
5   Raj   34    Chennai   Male          89

 

Extract the rows where Gender is Male

male_gender_boolean_series = (df['Gender'] == 'Male')
male_gender_rows = df[male_gender_boolean_series]

Above statement return below rows.

       Name  Age       City Gender  Percentage
0   Krishna   34  Bangalore   Male          81
2      Joel   29  Hyderabad   Male          67
4  Jitendra   52  Bangalore   Male          87
5       Raj   34    Chennai   Male          89

Extract the rows where the Percentage is > 85

percentage_greater_85_boolean_series = (df['Percentage'] > 85)
percentage_greater_85_rows = df[percentage_greater_85_boolean_series]

Above statement return below rows.

       Name  Age       City  Gender  Percentage
3     Chamu   35    Chennai  Female         100
4  Jitendra   52  Bangalore    Male          87
5       Raj   34    Chennai    Male          89

filter_by_comparison_operator.py

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', "Raj"],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Percentage': [81, 76, 67, 100, 87, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)

print('\nExtract the rows with index 2 and 5')
my_list = [False, False, True, False, False, True]
my_series = pd.Series(my_list)
print(df[my_series])

print('\nExtract the rows where gender is Male')
male_gender_boolean_series = (df['Gender'] == 'Male')
male_gender_rows = df[male_gender_boolean_series]
print(male_gender_rows)

print('\nExtract the rows where Percentage > 85')
percentage_greater_85_boolean_series = (df['Percentage'] > 85)
percentage_greater_85_rows = df[percentage_greater_85_boolean_series]
print(percentage_greater_85_rows)

Output

Original DataFrame
       Name  Age       City  Gender  Percentage
0   Krishna   34  Bangalore    Male          81
1     Sailu   35  Hyderabad  Female          76
2      Joel   29  Hyderabad    Male          67
3     Chamu   35    Chennai  Female         100
4  Jitendra   52  Bangalore    Male          87
5       Raj   34    Chennai    Male          89

Extract the rows with index 2 and 5
   Name  Age       City Gender  Percentage
2  Joel   29  Hyderabad   Male          67
5   Raj   34    Chennai   Male          89

Extract the rows where gender is Male
       Name  Age       City Gender  Percentage
0   Krishna   34  Bangalore   Male          81
2      Joel   29  Hyderabad   Male          67
4  Jitendra   52  Bangalore   Male          87
5       Raj   34    Chennai   Male          89

Extract the rows where Percentage > 85
       Name  Age       City  Gender  Percentage
3     Chamu   35    Chennai  Female         100
4  Jitendra   52  Bangalore    Male          87
5       Raj   34    Chennai    Male          89



 

Previous                                                 Next                                                 Home

No comments:

Post a Comment