Monday, 15 April 2024

Pandas: Retrieve the details of a specific group from grouped data

Using the 'get_group()' method of a DataFrameGroupBy object, we can retrieve the rows of a specific group from the grouped data by its group name.

Example

import pandas as pd

data = {'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', 'Raj'],
        'Age': [34, 25, 29, 41, 52, 23],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)
group_by_city = df.groupby('City')
bangalore_person_details = group_by_city.get_group('Bangalore')

 

In the above example, we define a DataFrame 'df' with the columns "Name", "Age", "City" and "Gender". Grouping the DataFrame by the "City" column with groupby('City') returns a DataFrameGroupBy object, stored in 'group_by_city'. Calling get_group() on 'group_by_city' with a group name ('Bangalore', 'Hyderabad' or 'Chennai' in this case) returns a new DataFrame. Here, get_group('Bangalore') returns 'bangalore_person_details', which contains only the rows of 'df' whose city is 'Bangalore'.
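One case the example above does not cover is a group name that does not exist in the data. In that case get_group() raises a KeyError. The following is a minimal sketch of one way to guard against it. The helper name 'get_group_or_none' and the city 'Mumbai' are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Small sample, similar in shape to the DataFrame used in this post
df = pd.DataFrame({
    'Name': ['Krishna', 'Chamu', 'Joel'],
    'City': ['Bangalore', 'Chennai', 'Hyderabad'],
})
group_by_city = df.groupby('City')


# Hypothetical helper (not a pandas API): wraps get_group() so that
# an unknown group name yields None instead of raising KeyError.
def get_group_or_none(grouped, name):
    try:
        return grouped.get_group(name)
    except KeyError:
        return None


# The 'groups' attribute maps each group name to its row labels,
# so its keys are the names that get_group() accepts.
print(sorted(group_by_city.groups))

print(get_group_or_none(group_by_city, 'Bangalore'))
print(get_group_or_none(group_by_city, 'Mumbai'))
```

Checking `name in group_by_city.groups` before calling get_group() is an alternative if you prefer to avoid the try/except.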

 

get_group_data.py

 

import pandas as pd

# Print the content of DataFrameGroupBy object
def print_group_by_result(group_by_object, label):
    print('*'*50)
    print(label,'\n')
    for group_name, group_data in group_by_object:
        print("Group Name:", group_name)
        print(group_data)
        print()
    print('*' * 50)


# Create a sample DataFrame
data = {'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', "Raj"],
        'Age': [34, 25, 29, 41, 52, 23],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)
print(df)

group_by_city = df.groupby('City')
print('\nGroup by city is')
print('type of group_by_city is : ', type(group_by_city))
print_group_by_result(group_by_city, 'Group by city details')

bangalore_person_details = group_by_city.get_group('Bangalore')
print('\nType of bangalore_person_details : ', type(bangalore_person_details))
print('\nData in bangalore_person_details is : ')
print(bangalore_person_details)

 

Output

      Name  Age       City  Gender
0  Krishna   34  Bangalore    Male
1    Chamu   25    Chennai  Female
2     Joel   29  Hyderabad    Male
3     Gopi   41  Hyderabad    Male
4   Sravya   52  Bangalore  Female
5      Raj   23    Chennai    Male

Group by city is
type of group_by_city is :  <class 'pandas.core.groupby.generic.DataFrameGroupBy'>
**************************************************
Group by city details 

Group Name: Bangalore
      Name  Age       City  Gender
0  Krishna   34  Bangalore    Male
4   Sravya   52  Bangalore  Female

Group Name: Chennai
    Name  Age     City  Gender
1  Chamu   25  Chennai  Female
5    Raj   23  Chennai    Male

Group Name: Hyderabad
   Name  Age       City Gender
2  Joel   29  Hyderabad   Male
3  Gopi   41  Hyderabad   Male

**************************************************

Type of bangalore_person_details :  <class 'pandas.core.frame.DataFrame'>

Data in bangalore_person_details is : 
      Name  Age       City  Gender
0  Krishna   34  Bangalore    Male
4   Sravya   52  Bangalore  Female

 

Note

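a.   get_group() returns a DataFrame when it is called on a DataFrameGroupBy object, that is, when the grouped result holds multiple columns. If a single column is selected from the grouped result (for example, group_by_city['Age']), the object is a SeriesGroupBy and get_group() returns a Series.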
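The two return types can be compared in a short sketch. It uses a trimmed copy of the sample data, and the variable names 'name_and_age' and 'age_only' are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Krishna', 'Chamu', 'Sravya'],
    'Age': [34, 25, 52],
    'City': ['Bangalore', 'Chennai', 'Bangalore'],
})
group_by_city = df.groupby('City')

# A list of columns keeps a DataFrameGroupBy, so get_group() returns a DataFrame
name_and_age = group_by_city[['Name', 'Age']].get_group('Bangalore')

# A single column label gives a SeriesGroupBy, so get_group() returns a Series
age_only = group_by_city['Age'].get_group('Bangalore')

print(type(name_and_age))
print(type(age_only))
```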

 

