Monday, 18 December 2023

Pandas: Filter rows of a DataFrame using where method

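Using the ‘where()’ method, we can filter the rows of a DataFrame by a condition. Unlike boolean indexing, where() keeps the shape of the original DataFrame: rows that do not meet the condition are not dropped, they are replaced with NaN (or with a value you supply).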

I am using the below data set to demonstrate the examples.

       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      39
1     Sailu   35  Hyderabad  Female      43
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      41
5       Raj   34    Chennai    Male      89

Example 1: Get all the users who are from the city 'Bangalore'

bangalore_users = df.where(df['City'] == 'Bangalore')

‘bangalore_users’ points to the below data set.

        Name   Age       City Gender  Rating
0   Krishna  34.0  Bangalore   Male    39.0
1       NaN   NaN        NaN    NaN     NaN
2       NaN   NaN        NaN    NaN     NaN
3       NaN   NaN        NaN    NaN     NaN
4  Jitendra  52.0  Bangalore   Male    41.0
5       NaN   NaN        NaN    NaN     NaN

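As you can see in the above output, ‘bangalore_users’ contains NaN values for rows where the condition is not met, and it retains the original values where the condition is met. The Age and Rating values are shown as 34.0 and 39.0 because NaN is a floating-point value, so pandas converts these integer columns to float.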
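
If the goal is to keep only the matching rows, one option is to drop the all-NaN rows afterwards. The following is a minimal sketch, assuming the same sample data as above; the variable names 'masked' and 'matched_only' are illustrative and not part of the original example.

```python
import pandas as pd

# Same sample data as used throughout this post
df = pd.DataFrame({
    'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Raj'],
    'Age': [34, 35, 29, 35, 52, 34],
    'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
    'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
    'Rating': [39, 43, 67, 100, 41, 89],
})

# where() keeps all six rows; non-matching rows become NaN
masked = df.where(df['City'] == 'Bangalore')

# Drop rows in which every column is NaN, then restore the integer dtypes
# that were converted to float when NaN was introduced
matched_only = (
    masked.dropna(how='all')
          .astype({'Age': 'int64', 'Rating': 'int64'})
)
print(matched_only)
```

Note that dropna(how='all') would also drop any row that was entirely NaN in the original data, so this approach is only safe when the source DataFrame has no such rows.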

 

You can also provide an alternative value to replace the NaN values by passing a second argument (the 'other' parameter) to the where() method.

 

For example,

bangalore_users = df.where(df['City'] == 'Bangalore', 'not_matched')

In the above example, ‘bangalore_users’ points to the below data set.

           Name          Age         City       Gender       Rating
0      Krishna           34    Bangalore         Male           39
1  not_matched  not_matched  not_matched  not_matched  not_matched
2  not_matched  not_matched  not_matched  not_matched  not_matched
3  not_matched  not_matched  not_matched  not_matched  not_matched
4     Jitendra           52    Bangalore         Male           41
5  not_matched  not_matched  not_matched  not_matched  not_matched

In this case, the rows that don't meet the condition are replaced with the string 'not_matched' in the resulting DataFrame.
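
Because the replacement is applied to every column, numeric columns such as Age and Rating end up holding a mix of numbers and strings. As a hedged sketch (the names 'is_bangalore', 'whole_rows' and 'labeled', and the label 'Other', are assumptions for illustration), where() can instead be applied to a single column, which leaves the other columns untouched:

```python
import pandas as pd

# Same sample data as used throughout this post
df = pd.DataFrame({
    'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Raj'],
    'Age': [34, 35, 29, 35, 52, 34],
    'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
    'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
    'Rating': [39, 43, 67, 100, 41, 89],
})

is_bangalore = df['City'] == 'Bangalore'

# Replacing whole rows with a string: Age now mixes integers and strings
whole_rows = df.where(is_bangalore, 'not_matched')
print(whole_rows['Age'].tolist())

# Series.where() on the City column only; Age and Rating stay integer columns
labeled = df.assign(City=df['City'].where(is_bangalore, 'Other'))
print(labeled)
```

The second form is often easier to work with afterwards, since numeric operations such as labeled['Age'].mean() still behave as expected.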

 

Example 2: Get the rows whose age is 34 or Gender is 'Female'

users_age_is_34_or_female = df.where((df['Age'] == 34) | (df['Gender'] == 'Female'), 'not_matched')

 

'users_age_is_34_or_female' will point to the below data set.

           Name          Age         City       Gender       Rating
0      Krishna           34    Bangalore         Male           39
1        Sailu           35    Hyderabad       Female           43
2  not_matched  not_matched  not_matched  not_matched  not_matched
3        Chamu           35      Chennai       Female          100
4  not_matched  not_matched  not_matched  not_matched  not_matched
5          Raj           34      Chennai         Male           89
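
For comparison, here is a small sketch of the same condition used with boolean indexing, which returns only the matching rows instead of masking the others. It assumes the same sample data; 'condition', 'matched' and 'not_matched_rows' are illustrative names. Each comparison needs its own parentheses because | and & bind more tightly than ==.

```python
import pandas as pd

# Same sample data as used throughout this post
df = pd.DataFrame({
    'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Raj'],
    'Age': [34, 35, 29, 35, 52, 34],
    'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
    'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
    'Rating': [39, 43, 67, 100, 41, 89],
})

# Build the boolean condition once so it can be reused
condition = (df['Age'] == 34) | (df['Gender'] == 'Female')

# Boolean indexing keeps only the rows where the condition is True
matched = df[condition]
print(matched)

# ~ negates the condition and selects the remaining rows
not_matched_rows = df[~condition]
print(not_matched_rows)
```

Since no NaN values are introduced here, the Age and Rating columns keep their integer type.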

Find the complete working application below.

 

where_method.py

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', "Raj"],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [39, 43, 67, 100, 41, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)

# Get the rows whose city is 'Bangalore'
bangalore_users = df.where(df['City'] == 'Bangalore')
print('\nbangalore_users\n', bangalore_users)

# Replace the values in non-matching rows with 'not_matched' instead of NaN
bangalore_users = df.where(df['City'] == 'Bangalore', 'not_matched')
print('\nbangalore_users\n', bangalore_users)

# Get the rows whose age is 34 or Gender is 'Female'
users_age_is_34_or_female = df.where((df['Age'] == 34) | (df['Gender'] == 'Female'), 'not_matched')
print('\nusers_age_is_34_or_female\n', users_age_is_34_or_female)

Output

Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      39
1     Sailu   35  Hyderabad  Female      43
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      41
5       Raj   34    Chennai    Male      89

bangalore_users
        Name   Age       City Gender  Rating
0   Krishna  34.0  Bangalore   Male    39.0
1       NaN   NaN        NaN    NaN     NaN
2       NaN   NaN        NaN    NaN     NaN
3       NaN   NaN        NaN    NaN     NaN
4  Jitendra  52.0  Bangalore   Male    41.0
5       NaN   NaN        NaN    NaN     NaN

bangalore_users
           Name          Age         City       Gender       Rating
0      Krishna           34    Bangalore         Male           39
1  not_matched  not_matched  not_matched  not_matched  not_matched
2  not_matched  not_matched  not_matched  not_matched  not_matched
3  not_matched  not_matched  not_matched  not_matched  not_matched
4     Jitendra           52    Bangalore         Male           41
5  not_matched  not_matched  not_matched  not_matched  not_matched

users_age_is_34_or_female
           Name          Age         City       Gender       Rating
0      Krishna           34    Bangalore         Male           39
1        Sailu           35    Hyderabad       Female           43
2  not_matched  not_matched  not_matched  not_matched  not_matched
3        Chamu           35      Chennai       Female          100
4  not_matched  not_matched  not_matched  not_matched  not_matched
5          Raj           34      Chennai         Male           89

 
