Monday, 15 April 2024

Pandas: Get dictionary like view on group by result

Using the DataFrameGroupBy.groups attribute, we can get a dictionary-like view of the groups formed by a groupby operation.

Example

import pandas as pd

data = {'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', 'Raj'],
        'Age': [34, 25, 29, 41, 52, 23],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)

group_by_city = df.groupby('City')
dict_view = group_by_city.groups

 

In the example above, we define a DataFrame 'df' with the columns "Name", "Age", "City" and "Gender", then group it by the "City" column using groupby('City'), which returns a DataFrameGroupBy object named 'group_by_city'. The groups attribute of 'group_by_city' gives a dictionary that maps each unique group key ('Bangalore', 'Chennai' and 'Hyderabad' in this case) to the index labels of the rows in the original DataFrame df that belong to that group.

 

For the example above, dict_view contains the following information.

{'Bangalore': [0, 4], 'Chennai': [1, 5], 'Hyderabad': [2, 3]}
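Because each value in the mapping holds row labels, it can be passed to df.loc to pull out the rows of one group. The snippet below is a minimal sketch of that idea, using a trimmed-down copy of the sample data; the variable names chennai_labels, chennai_rows and same_rows are only illustrative.

```python
import pandas as pd

# Trimmed-down version of the sample data (only the columns needed here)
data = {'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', 'Raj'],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai']}
df = pd.DataFrame(data)

group_by_city = df.groupby('City')
dict_view = group_by_city.groups

# Look up the row labels for one group, then select those rows with df.loc
chennai_labels = dict_view['Chennai']
chennai_rows = df.loc[chennai_labels]
print(chennai_rows)

# get_group() is the more direct way to fetch the rows of a single group
same_rows = group_by_city.get_group('Chennai')
print(list(chennai_rows.index) == list(same_rows.index))
```

When only the rows of a single group are needed, get_group is usually the simpler choice; the groups mapping is more useful when you want to inspect which labels landed in which group.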

 

dictionary_like_view_of_groups.py

 

import pandas as pd

# Print the content of DataFrameGroupBy object
def print_group_by_result(group_by_object, label):
    print('*'*50)
    print(label,'\n')
    for group_name, group_data in group_by_object:
        print("Group Name:", group_name)
        print(group_data)
        print()
    print('*' * 50)


# Create a sample DataFrame
data = {'Name': ['Krishna', 'Chamu', 'Joel', 'Gopi', 'Sravya', "Raj"],
        'Age': [34, 25, 29, 41, 52, 23],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Male', 'Female', 'Male']}

df = pd.DataFrame(data)
print(df)

group_by_city = df.groupby('City')
print('\nGroup by city is')
print('type of group_by_city is : ', type(group_by_city))
print_group_by_result(group_by_city, 'Group by city details')

dict_view = group_by_city.groups
print('\ntype of dict_view : ', type(dict_view))
print('Data in  dict_view is')
print(dict_view)

 

Output

      Name  Age       City  Gender
0  Krishna   34  Bangalore    Male
1    Chamu   25    Chennai  Female
2     Joel   29  Hyderabad    Male
3     Gopi   41  Hyderabad    Male
4   Sravya   52  Bangalore  Female
5      Raj   23    Chennai    Male

Group by city is
type of group_by_city is :  <class 'pandas.core.groupby.generic.DataFrameGroupBy'>
**************************************************
Group by city details 

Group Name: Bangalore
      Name  Age       City  Gender
0  Krishna   34  Bangalore    Male
4   Sravya   52  Bangalore  Female

Group Name: Chennai
    Name  Age     City  Gender
1  Chamu   25  Chennai  Female
5    Raj   23  Chennai    Male

Group Name: Hyderabad
   Name  Age       City Gender
2  Joel   29  Hyderabad   Male
3  Gopi   41  Hyderabad   Male

**************************************************

type of dict_view :  <class 'pandas.io.formats.printing.PrettyDict'>
Data in  dict_view is
{'Bangalore': [0, 4], 'Chennai': [1, 5], 'Hyderabad': [2, 3]}
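The numbers in the output above are index labels, not row positions; they look like positions only because df uses the default 0..n-1 index. A small sketch with a hypothetical string index (the labels 'a', 'b' and 'c' are made up for illustration) makes the difference visible:

```python
import pandas as pd

# Hypothetical DataFrame with custom string labels instead of the default index
df_custom = pd.DataFrame(
    {'City': ['Bangalore', 'Chennai', 'Bangalore']},
    index=['a', 'b', 'c'],
)

groups = df_custom.groupby('City').groups

# Convert each value to a plain list so the labels are easy to read
labels = {city: list(idx) for city, idx in groups.items()}
print(labels)
# {'Bangalore': ['a', 'c'], 'Chennai': ['b']}
```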

 

  
