Wednesday 17 April 2024

Pandas: Get the maximum value for each group

Using the 'DataFrameGroupBy.max()' method, we can get the maximum value of each column for each group.

Example

import pandas as pd

data = { 'Age': [34, 25, 29, 41, 52, 23],
        'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', "Raj"],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)

group_by_city = df.groupby('City')
max_ele_in_each_group = group_by_city.max()

 

In the example above, I defined a DataFrame 'df' with the columns "Age", "Name", "City" and "Gender", then grouped it by the "City" column using groupby('City'). The result is a DataFrameGroupBy object named 'group_by_city'.
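To see what the grouping step produced before any aggregation, we can inspect the DataFrameGroupBy object directly. The snippet below is a minimal sketch that reuses the sample data from this post; the variable name 'hyderabad_rows' is only for illustration.

```python
import pandas as pd

data = { 'Age': [34, 25, 29, 41, 52, 23],
        'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', "Raj"],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)
group_by_city = df.groupby('City')

# Number of distinct groups (one per unique city)
print(group_by_city.ngroups)

# Group keys, sorted here explicitly for a stable display order
print(sorted(group_by_city.groups))

# Rows that belong to a single group
hyderabad_rows = group_by_city.get_group('Hyderabad')
print(hyderabad_rows)
```

Nothing is aggregated at this point. The object only records which rows belong to which city.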

 

Calling the max() method on the 'group_by_city' object returns a new DataFrame named 'max_ele_in_each_group'. For each group, this DataFrame contains the maximum value of every non-grouping column, computed across all the rows of that group. The index of the DataFrame holds the unique group values ('Bangalore', 'Chennai' and 'Hyderabad' in this case). Each column's maximum is computed independently, so the values in one result row do not necessarily come from the same original row. For string columns such as "Name" and "Gender", the maximum is the lexicographically largest value.
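Because max() works column by column, a result row can mix values from different original rows. If what we want is the complete row of the oldest person in each city, one possible approach (a sketch, not the only way) is to combine 'idxmax()' on the "Age" column with 'df.loc'. The names 'max_per_city' and 'oldest_per_city' are assumptions made for this illustration.

```python
import pandas as pd

data = { 'Age': [34, 25, 29, 41, 52, 23],
        'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', "Raj"],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)

# Column-wise maximum: every column is reduced on its own
max_per_city = df.groupby('City').max()
print(max_per_city)

# Index label of the row with the highest Age in each city
oldest_row_labels = df.groupby('City')['Age'].idxmax()

# Full original rows for those labels
oldest_per_city = df.loc[oldest_row_labels]
print(oldest_per_city)
```

For Hyderabad, max() pairs the age 41 with the name 'Joel', while the row selected through idxmax() belongs to 'Gopi', who is actually 41.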

 

Note

a. By default, the 'max()' method computes the maximum for every column except the grouping column, including non-numeric columns such as strings.
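When only some columns are of interest, we can narrow the calculation. The sketch below shows two options, assuming the same sample data: selecting a single column before calling max(), and passing 'numeric_only=True' to skip the string columns. The variable names are illustrative.

```python
import pandas as pd

data = { 'Age': [34, 25, 29, 41, 52, 23],
        'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', "Raj"],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)

# Option 1: select one column first, the result is a Series indexed by City
max_age_per_city = df.groupby('City')['Age'].max()
print(max_age_per_city)

# Option 2: keep a DataFrame, but only aggregate numeric columns
numeric_max_per_city = df.groupby('City').max(numeric_only=True)
print(numeric_max_per_city)
```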

 

Find the complete working application below.

 

max_value_of_each_group.py

 

import pandas as pd

# Print the content of DataFrameGroupBy object
def print_group_by_result(group_by_object, label):
    print('*'*50)
    print(label,'\n')
    for group_name, group_data in group_by_object:
        print("Group Name:", group_name)
        print(group_data)
        print()
    print('*' * 50)


# Create a sample DataFrame
data = { 'Age': [34, 25, 29, 41, 52, 23],
        'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', "Raj"],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)
print(df)

group_by_city = df.groupby('City')
print('\nGroup by city is')
print('type of group_by_city is : ', type(group_by_city))
print_group_by_result(group_by_city, 'Group by city details')

max_ele_in_each_group = group_by_city.max()
print('\nType of max_ele_in_each_group : ', type(max_ele_in_each_group))
print('\nData in max_ele_in_each_group is : ')
print(max_ele_in_each_group)

Output

   Age     Name       City  Gender
0   34  Krishna  Bangalore    Male
1   25    Chamu    Chennai  Female
2   29     Joel  Hyderabad    Male
3   41     Gopi  Hyderabad    Male
4   52   Sravya  Bangalore  Female
5   23      Raj    Chennai    Male

Group by city is
type of group_by_city is :  <class 'pandas.core.groupby.generic.DataFrameGroupBy'>
**************************************************
Group by city details 

Group Name: Bangalore
   Age     Name       City  Gender
0   34  Krishna  Bangalore    Male
4   52   Sravya  Bangalore  Female

Group Name: Chennai
   Age   Name     City  Gender
1   25  Chamu  Chennai  Female
5   23    Raj  Chennai    Male

Group Name: Hyderabad
   Age  Name       City Gender
2   29  Joel  Hyderabad   Male
3   41  Gopi  Hyderabad   Male

**************************************************

Type of max_ele_in_each_group :  <class 'pandas.core.frame.DataFrame'>

Data in max_ele_in_each_group is : 
           Age    Name Gender
City                         
Bangalore   52  Sravya   Male
Chennai     25     Raj   Male
Hyderabad   41    Joel   Male

  
