Monday, 8 April 2024

How to extract the row from multi index data frame?

In this post, I am going to explain how to extract the row from a multi index data frame using ‘loc’ accessor.

Let’s use below data set to experiment with the examples.

               Sales       City
Year Quarter                  
2022 1          100  Bangalore
     3          150  Bangalore
2020 2          115  Bangalore
2021 3          120  Hyderabad
     1          180  Hyderabad
     2           90  Hyderabad
2020 1          130    Chennai
2022 2          160    Chennai

 

As you see above snippet, ‘Year’ and ‘Quarter’ are used as indexes.

 

Example 1: Get the 3rd quarter results of year 2022.

By supplying the multi index values as a tuple, we can extract the row details.

year_2022_quarter_3_results = df.loc[(2022, 3)]

 

 

Example 2: Extract only ‘Sales’ column for the 3rd quarter results of year 2022.

 

By specifying the list of columns that we are interested as second argument, we can extract the column values of our choice.

 

year_2022_quarter_3_sales = df.loc[(2022, 3), ['Sales']]

 

Find the below working application.

 

access_row.py

 

import pandas as pd

# Create a sample DataFrame
data = {'Year': [2022, 2022, 2020, 2021, 2021, 2021, 2020, 2022],
        'Quarter': [1, 3, 2, 3, 1, 2, 1, 2],
        'Sales': [100, 150, 115, 120, 180, 90, 130, 160],
        'City': ['Bangalore', 'Bangalore', 'Bangalore', 'Hyderabad', 'Hyderabad', 'Hyderabad', 'Chennai', 'Chennai']
        }
df = pd.DataFrame(data)
print('Original DataFrame\n', df)

# Set Year and Quarter as indexes
df.set_index(['Year', 'Quarter'], inplace=True)
print('\nAfter setting index columns Year and Quarter\n',df)

year_2022_quarter_3_results = df.loc[(2022, 3)]
print('\nyear_2022_quarter_3_results\n',year_2022_quarter_3_results)

# Extract Sales column only
year_2022_quarter_3_sales = df.loc[(2022, 3), ['Sales']]
print('\nyear_2022_quarter_3_sales\n',year_2022_quarter_3_sales)

 

Output

Original DataFrame
    Year  Quarter  Sales       City
0  2022        1    100  Bangalore
1  2022        3    150  Bangalore
2  2020        2    115  Bangalore
3  2021        3    120  Hyderabad
4  2021        1    180  Hyderabad
5  2021        2     90  Hyderabad
6  2020        1    130    Chennai
7  2022        2    160    Chennai

After setting index columns Year and Quarter
               Sales       City
Year Quarter                  
2022 1          100  Bangalore
     3          150  Bangalore
2020 2          115  Bangalore
2021 3          120  Hyderabad
     1          180  Hyderabad
     2           90  Hyderabad
2020 1          130    Chennai
2022 2          160    Chennai

year_2022_quarter_3_results
 Sales          150
City     Bangalore
Name: (2022, 3), dtype: object

year_2022_quarter_3_sales
 Sales    150
Name: (2022, 3), dtype: int64

 

 

Previous                                                 Next                                                 Home

No comments:

Post a Comment