Using ‘agg’ method of DataFrameGroupBy object, we can apply one or more aggregation functions on groupby data.
I am using below data set to demonstrate the examples.
Name Age City Gender Weight 0 Krishna 34 Bangalore Male 74 1 Chamu 25 Chennai Female 58 2 Joel 29 Hyderabad Male 85 3 Gopi 41 Hyderabad Male 87 4 Sravya 52 Bangalore Female 63 5 Raj 23 Chennai Male 79
Let’s aggregate the data by City column prior to applying the aggregation functions.
Group Name: Bangalore Name Age City Gender Weight 0 Krishna 34 Bangalore Male 74 4 Sravya 52 Bangalore Female 63 Group Name: Chennai Name Age City Gender Weight 1 Chamu 25 Chennai Female 58 5 Raj 23 Chennai Male 79 Group Name: Hyderabad Name Age City Gender Weight 2 Joel 29 Hyderabad Male 85 3 Gopi 41 Hyderabad Male 87
Example 1: Applying ‘sum’ function on Age column and ‘mean’ function on ‘Weight’ column.
agg_result = group_by_city.agg({
'Age' : 'sum',
'Weight' : 'mean'
})
In the above example, to apply multiple aggregation functions to specific columns, I passed a dictionary to the agg() method. The keys of the dictionary represent the column names, and the values can be either a single aggregation function or a list of multiple aggregation functions.
‘agg_result’ point to below dataframe.
Age Weight City Bangalore 86 68.5 Chennai 48 68.5 Hyderabad 70 86.0
Example 2: Perform multiple operations on a given column
agg_result = group_by_city.agg({
'Age' : ['sum', 'size'],
'Weight' : 'mean'
})
Above snippet apply 'sum', 'size' functions on ‘Age’ column and ‘mean’ function on ‘Weight’ column.
‘agg_result’ point to below dataset.
Age Weight sum size mean City Bangalore 86 2 68.5 Chennai 48 2 68.5 Hyderabad 70 2 86.0
Example 3: If you have only numeric columns in the data frame, and want to apply same set of operations on the columns, then you can pass list of operations to the agg method like below.
group_by_city = df[['Age', 'City', 'Weight']].groupby('City')
agg_result = group_by_city.agg(['sum', 'mean', 'size'])
‘agg_result’ point to below dataset.
Age Weight sum mean size sum mean size City Bangalore 86 43.0 2 137 68.5 2 Chennai 48 24.0 2 137 68.5 2 Hyderabad 70 35.0 2 172 86.0 2
Above snippet applies sum, mean and size operations on ‘Age’ and ‘Weight’ columns.
Find the below working Application.
one_or_more_aggregations.py
import pandas as pd
# Print the content of DataFrameGroupBy object
def print_group_by_result(group_by_object, label):
print('*'*50)
print(label,'\n')
for group_name, group_data in group_by_object:
print("Group Name:", group_name)
print(group_data)
print()
print('*' * 50)
# Create a sample DataFrame
data = { 'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', "Raj"],
'Age': [34, 25, 29, 41, 52, 23],
'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male'],
'Weight': [74, 58, 85, 87, 63, 79]}
df = pd.DataFrame(data)
print(df)
group_by_city = df.groupby('City')
print('\nGroup by city is')
print('type of group_by_city is : ', type(group_by_city))
print_group_by_result(group_by_city, 'Group by city details')
# specify aggregation for each column separately
agg_result = group_by_city.agg({
'Age' : 'sum',
'Weight' : 'mean'
})
print('\nSum on Age column and mean on Weight column')
print(agg_result)
# Specify multiple operations on a column
agg_result = group_by_city.agg({
'Age' : ['sum', 'size'],
'Weight' : 'mean'
})
print('\nMultiple operations on a column')
print(agg_result)
group_by_city = df[['Age', 'City', 'Weight']].groupby('City')
agg_result = group_by_city.agg(['sum', 'mean', 'size'])
print('\nMultiple operations on a column')
print(agg_result)
Output
Name Age City Gender Weight 0 Krishna 34 Bangalore Male 74 1 Chamu 25 Chennai Female 58 2 Joel 29 Hyderabad Male 85 3 Gopi 41 Hyderabad Male 87 4 Sravya 52 Bangalore Female 63 5 Raj 23 Chennai Male 79 Group by city is type of group_by_city is : <class 'pandas.core.groupby.generic.DataFrameGroupBy'> ************************************************** Group by city details Group Name: Bangalore Name Age City Gender Weight 0 Krishna 34 Bangalore Male 74 4 Sravya 52 Bangalore Female 63 Group Name: Chennai Name Age City Gender Weight 1 Chamu 25 Chennai Female 58 5 Raj 23 Chennai Male 79 Group Name: Hyderabad Name Age City Gender Weight 2 Joel 29 Hyderabad Male 85 3 Gopi 41 Hyderabad Male 87 ************************************************** Sum on Age column and mean on Weight column Age Weight City Bangalore 86 68.5 Chennai 48 68.5 Hyderabad 70 86.0 Multiple operations on a column Age Weight sum size mean City Bangalore 86 2 68.5 Chennai 48 2 68.5 Hyderabad 70 2 86.0 Multiple operations on a column Age Weight sum mean size sum mean size City Bangalore 86 43.0 2 137 68.5 2 Chennai 48 24.0 2 137 68.5 2 Hyderabad 70 35.0 2 172 86.0 2
Previous Next Home
No comments:
Post a Comment