Friday, 5 April 2024

Pandas: Customize multi-level index names

There are two ways to customize the multi-level index names.

a.   Updating the index.names property

b.   Using the index.set_names method

 

Let’s experiment with the data set below.

               Sales       City
Year Quarter                  
2020 1          100  Bangalore
     2          150  Bangalore
     3          115  Bangalore
2021 1          120  Hyderabad
     2          180  Hyderabad
     3           90  Hyderabad
2022 1          130    Chennai
     2          160    Chennai

 

Updating the index.names property

The following snippet sets

a.   the level 0 name of the multi-level index to ‘MyYear’ and

b.   the level 1 name to ‘Quarter’ (unchanged).

 


df.index.names = ['MyYear', 'Quarter']
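As a minimal sketch of how this assignment behaves (the two-row frame named demo is made up for illustration and is not the article's data set), index.names expects one name per index level, and a list of a different length is rejected with a ValueError:

```python
import pandas as pd

# Hypothetical two-level frame, used only for illustration
demo = pd.DataFrame(
    {'Year': [2020, 2020], 'Quarter': [1, 2], 'Sales': [100, 150]}
).set_index(['Year', 'Quarter'])

# One name per level: level 0 -> 'MyYear', level 1 -> 'Quarter'
demo.index.names = ['MyYear', 'Quarter']
names_after = list(demo.index.names)
print(names_after)

# A list whose length differs from the number of levels is rejected
try:
    demo.index.names = ['OnlyOneName']
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print('mismatch rejected:', mismatch_rejected)
```

Because the whole list is replaced at once, every level name has to be restated, including the ones that stay the same.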

 

Using the index.set_names method

The following snippet sets

a.   the level 0 name of the multi-level index back to ‘Year’ and

b.   the level 1 name to ‘MyQuarter’.

 


df.index.set_names(['Year', 'MyQuarter'], inplace=True)
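For comparison, here is a small sketch of the same call without inplace=True (again using a made-up frame named demo). In that form set_names leaves the existing index untouched and returns a renamed copy, which has to be assigned back to take effect:

```python
import pandas as pd

# Hypothetical two-level frame, used only for illustration
demo = pd.DataFrame(
    {'Year': [2020, 2020], 'Quarter': [1, 2], 'Sales': [100, 150]}
).set_index(['Year', 'Quarter'])

# Without inplace=True, set_names returns a renamed copy of the index
renamed = demo.index.set_names(['Year', 'MyQuarter'])
names_before_assign = list(demo.index.names)  # frame is still unchanged
print(names_before_assign)

# Assigning the result back applies the new names to the frame
demo.index = renamed
names_after_assign = list(demo.index.names)
print(names_after_assign)
```

Which form to use is mostly a matter of style; the assign-back form makes the change explicit at the point where the frame is modified.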

 

Find the complete working application below.

 

customize_index_names.py

import pandas as pd

# Create a sample DataFrame
data = {'Year': [2020, 2020, 2020, 2021, 2021, 2021, 2022, 2022],
        'Quarter': [1, 2, 3, 1, 2, 3, 1, 2],
        'Sales': [100, 150, 115, 120, 180, 90, 130, 160],
        'City': ['Bangalore', 'Bangalore', 'Bangalore', 'Hyderabad', 'Hyderabad', 'Hyderabad', 'Chennai', 'Chennai']
        }
df = pd.DataFrame(data)
print('Original DataFrame\n', df)

# Set Year and Quarter as indexes
df.set_index(['Year', 'Quarter'], inplace=True)
print('\nAfter setting index columns Year and Quarter\n',df)

print('\nindex names : ', df.index.names)
print('\nCustomize the index names')

# Way 1: assign a list with one name per level to index.names
df.index.names = ['MyYear', 'Quarter']
print('\nindex names : ', df.index.names)
print('\nDataFrame : \n', df)

# Way 2: rename all levels in place with set_names
df.index.set_names(['Year', 'MyQuarter'], inplace=True)
print('\nindex names : ', df.index.names)
print('\nDataFrame : \n', df)

 

Output

Original DataFrame
    Year  Quarter  Sales       City
0  2020        1    100  Bangalore
1  2020        2    150  Bangalore
2  2020        3    115  Bangalore
3  2021        1    120  Hyderabad
4  2021        2    180  Hyderabad
5  2021        3     90  Hyderabad
6  2022        1    130    Chennai
7  2022        2    160    Chennai

After setting index columns Year and Quarter
               Sales       City
Year Quarter                  
2020 1          100  Bangalore
     2          150  Bangalore
     3          115  Bangalore
2021 1          120  Hyderabad
     2          180  Hyderabad
     3           90  Hyderabad
2022 1          130    Chennai
     2          160    Chennai

index names :  ['Year', 'Quarter']

Customize the index names

index names :  ['MyYear', 'Quarter']

DataFrame : 
                 Sales       City
MyYear Quarter                  
2020   1          100  Bangalore
       2          150  Bangalore
       3          115  Bangalore
2021   1          120  Hyderabad
       2          180  Hyderabad
       3           90  Hyderabad
2022   1          130    Chennai
       2          160    Chennai

index names :  ['Year', 'MyQuarter']

DataFrame : 
                 Sales       City
Year MyQuarter                  
2020 1            100  Bangalore
     2            150  Bangalore
     3            115  Bangalore
2021 1            120  Hyderabad
     2            180  Hyderabad
     3             90  Hyderabad
2022 1            130    Chennai
     2            160    Chennai

 

Using set_names to update only a specific index level name

With the level argument, the set_names method can rename a single index level and leave the others untouched.

 

Example

df.index.set_names('MyYear', level=0, inplace=True)

 

The above snippet sets the name of the outermost index level (level 0) to MyYear.
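The level argument is not limited to a single position. As a hedged sketch on a made-up frame named demo, a level can also be addressed by its current name, and several levels can be renamed in one call by passing lists of equal length. The sketch assigns the returned index back rather than using inplace=True; either form should produce the same names.

```python
import pandas as pd

# Hypothetical two-level frame, used only for illustration
demo = pd.DataFrame(
    {'Year': [2020, 2020], 'Quarter': [1, 2], 'Sales': [100, 150]}
).set_index(['Year', 'Quarter'])

# Address the level by its current name instead of its position
demo.index = demo.index.set_names('MyYear', level='Year')
names_by_label = list(demo.index.names)
print(names_by_label)

# Rename several levels at once: names and level are matching lists
demo.index = demo.index.set_names(['Y', 'Q'], level=[0, 1])
names_by_list = list(demo.index.names)
print(names_by_list)
```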

 

update_index_names.py

 

import pandas as pd

# Create a sample DataFrame
data = {'Year': [2020, 2020, 2020, 2021, 2021, 2021, 2022, 2022],
        'Quarter': [1, 2, 3, 1, 2, 3, 1, 2],
        'Sales': [100, 150, 115, 120, 180, 90, 130, 160],
        'City': ['Bangalore', 'Bangalore', 'Bangalore', 'Hyderabad', 'Hyderabad', 'Hyderabad', 'Chennai', 'Chennai']
        }
df = pd.DataFrame(data)
print('Original DataFrame\n', df)

# Set Year and Quarter as indexes
df.set_index(['Year', 'Quarter'], inplace=True)
print('\nAfter setting index columns Year and Quarter\n',df)

# Rename one level at a time, selecting it by position with level=
df.index.set_names('MyYear', level=0, inplace=True)
df.index.set_names('MyQuarter', level=1, inplace=True)

print('\nAfter updating index column names\n',df)

Output

Original DataFrame
    Year  Quarter  Sales       City
0  2020        1    100  Bangalore
1  2020        2    150  Bangalore
2  2020        3    115  Bangalore
3  2021        1    120  Hyderabad
4  2021        2    180  Hyderabad
5  2021        3     90  Hyderabad
6  2022        1    130    Chennai
7  2022        2    160    Chennai

After setting index columns Year and Quarter
               Sales       City
Year Quarter                  
2020 1          100  Bangalore
     2          150  Bangalore
     3          115  Bangalore
2021 1          120  Hyderabad
     2          180  Hyderabad
     3           90  Hyderabad
2022 1          130    Chennai
     2          160    Chennai

After updating index column names
                   Sales       City
MyYear MyQuarter                  
2020   1            100  Bangalore
       2            150  Bangalore
       3            115  Bangalore
2021   1            120  Hyderabad
       2            180  Hyderabad
       3             90  Hyderabad
2022   1            130    Chennai
       2            160    Chennai

 
