Using sort_index method, we can sort a multi indexed data frame.
Let’s experiment with the below data set.
Year Quarter Sales City 0 2022 1 100 Bangalore 1 2022 3 150 Bangalore 2 2020 2 115 Bangalore 3 2021 3 120 Hyderabad 4 2021 1 180 Hyderabad 5 2021 2 90 Hyderabad 6 2020 1 130 Chennai 7 2022 2 160 Chennai
Set the Year and Quarter as multi index columns
df.set_index(['Year', 'Quarter'], inplace=True)
Above snippet convert the data set like below.
Sales City Year Quarter 2022 1 100 Bangalore 3 150 Bangalore 2020 2 115 Bangalore 2021 3 120 Hyderabad 1 180 Hyderabad 2 90 Hyderabad 2020 1 130 Chennai 2022 2 160 Chennai
Sort the data by ascending order of index columns
new_df = df.sort_index()
new_df points to below data set.
Sales City Year Quarter 2020 1 130 Chennai 2 115 Bangalore 2021 1 180 Hyderabad 2 90 Hyderabad 3 120 Hyderabad 2022 1 100 Bangalore 2 160 Chennai 3 150 Bangalore
Sort the data by descending order of index columns
By setting the ascending argument to False, we can sort the data by ascending order of index columns.
new_df = df.sort_index(ascending=False)
‘new_df’ point to below data set.
Sales City Year Quarter 2022 3 150 Bangalore 2 160 Chennai 1 100 Bangalore 2021 3 120 Hyderabad 2 90 Hyderabad 1 180 Hyderabad 2020 2 115 Bangalore 1 130 Chennai
Sort the data by ascending order of Year and descending order of Quarter
By passing the list of Booleans to the ascending argument, we can achieve this.
new_df = df.sort_index(ascending=[True,False])
List of booleans that we passed to ascending argument are in the order of the multi index columns. In this case, True mapped to Year column and False mapped to the ‘Quarter’ column.
‘new_df’ point to below data set.
Sales City Year Quarter 2020 2 115 Bangalore 1 130 Chennai 2021 3 120 Hyderabad 2 90 Hyderabad 1 180 Hyderabad 2022 3 150 Bangalore 2 160 Chennai 1 100 Bangalore
Sort the data by descending order of Year and ascending order of Quarter
By passing the list of Booleans to the ascending argument, we can achieve this.
new_df = df.sort_index(ascending=[False,True])
‘new_df’ point to below data set.
Sales City Year Quarter 2022 1 100 Bangalore 2 160 Chennai 3 150 Bangalore 2021 1 180 Hyderabad 2 90 Hyderabad 3 120 Hyderabad 2020 1 130 Chennai 2 115 Bangalore
Find the below working application.
sort_multi_index_data.py
import pandas as pd
# Create a sample DataFrame
data = {'Year': [2022, 2022, 2020, 2021, 2021, 2021, 2020, 2022],
'Quarter': [1, 3, 2, 3, 1, 2, 1, 2],
'Sales': [100, 150, 115, 120, 180, 90, 130, 160],
'City': ['Bangalore', 'Bangalore', 'Bangalore', 'Hyderabad', 'Hyderabad', 'Hyderabad', 'Chennai', 'Chennai']
}
df = pd.DataFrame(data)
print('Original DataFrame\n', df)
# Set Year and Quarter as indexes
df.set_index(['Year', 'Quarter'], inplace=True)
print('\nAfter setting index columns Year and Quarter\n',df)
# Sort the data by ascending order of the Year and Quarter columns
new_df = df.sort_index()
print('\nSort the data set by ascending order of index columns\n',new_df)
# Sort the data by descending order of the Year and Quarter columns
new_df = df.sort_index(ascending=False)
print('\nSort the data set by descending order of index columns\n',new_df)
# Sort the data by ascending order of the Year and descending order Quarter columns
new_df = df.sort_index(ascending=[True,False])
print('\nSort the data by ascending order of the Year and descending order Quarter columns\n',new_df)
# Sort the data by descending order of the Year and ascending order Quarter columns
new_df = df.sort_index(ascending=[False, True])
print('\nSort the data by descending order of the Year and ascending order Quarter columns\n',new_df)
Output
Original DataFrame Year Quarter Sales City 0 2022 1 100 Bangalore 1 2022 3 150 Bangalore 2 2020 2 115 Bangalore 3 2021 3 120 Hyderabad 4 2021 1 180 Hyderabad 5 2021 2 90 Hyderabad 6 2020 1 130 Chennai 7 2022 2 160 Chennai After setting index columns Year and Quarter Sales City Year Quarter 2022 1 100 Bangalore 3 150 Bangalore 2020 2 115 Bangalore 2021 3 120 Hyderabad 1 180 Hyderabad 2 90 Hyderabad 2020 1 130 Chennai 2022 2 160 Chennai Sort the data set by ascending order of index columns Sales City Year Quarter 2020 1 130 Chennai 2 115 Bangalore 2021 1 180 Hyderabad 2 90 Hyderabad 3 120 Hyderabad 2022 1 100 Bangalore 2 160 Chennai 3 150 Bangalore Sort the data set by descending order of index columns Sales City Year Quarter 2022 3 150 Bangalore 2 160 Chennai 1 100 Bangalore 2021 3 120 Hyderabad 2 90 Hyderabad 1 180 Hyderabad 2020 2 115 Bangalore 1 130 Chennai Sort the data by ascending order of the Year and descending order Quarter columns Sales City Year Quarter 2020 2 115 Bangalore 1 130 Chennai 2021 3 120 Hyderabad 2 90 Hyderabad 1 180 Hyderabad 2022 3 150 Bangalore 2 160 Chennai 1 100 Bangalore Sort the data by descending order of the Year and ascending order Quarter columns Sales City Year Quarter 2022 1 100 Bangalore 2 160 Chennai 3 150 Bangalore 2021 1 180 Hyderabad 2 90 Hyderabad 3 120 Hyderabad 2020 1 130 Chennai 2 115 Bangalore
Sort at given index level
By specifying ‘level’ argument, we can sort the data frame at specific index level.
Example 1: Sort the data set by ascending order of Year column.
We can specify the index level to sort by specify the level index or level label.
new_df = df.sort_index(level=0)
new_df = df.sort_index(level='Year')
Example 2: Sort the data by descending order of the Quarter.
new_df = df.sort_index(level=1, ascending=False)
new_df = df.sort_index(level='Quarter', ascending=False)
Find the below working application.
sort_multi_index_specific_level.py
import pandas as pd
# Create a sample DataFrame
data = {'Year': [2022, 2022, 2020, 2021, 2021, 2021, 2020, 2022],
'Quarter': [1, 3, 2, 3, 1, 2, 1, 2],
'Sales': [100, 150, 115, 120, 180, 90, 130, 160],
'City': ['Bangalore', 'Bangalore', 'Bangalore', 'Hyderabad', 'Hyderabad', 'Hyderabad', 'Chennai', 'Chennai']
}
df = pd.DataFrame(data)
print('Original DataFrame\n', df)
# Set Year and Quarter as indexes
df.set_index(['Year', 'Quarter'], inplace=True)
print('\nAfter setting index columns Year and Quarter\n',df)
# Sort the data set by ascending order of Year column
new_df = df.sort_index(level=0)
print('\nSort the data set by ascending order of Year column\n',new_df)
new_df = df.sort_index(level='Year')
print('\nSort the data set by ascending order of Year column\n',new_df)
# Sort the data by descending order of the Quarter
new_df = df.sort_index(level=1, ascending=False)
print('\nSort the data by descending order of the Quarter\n',new_df)
new_df = df.sort_index(level='Quarter', ascending=False)
print('\nSort the data by descending order of the Quarter\n',new_df)
Output
Original DataFrame Year Quarter Sales City 0 2022 1 100 Bangalore 1 2022 3 150 Bangalore 2 2020 2 115 Bangalore 3 2021 3 120 Hyderabad 4 2021 1 180 Hyderabad 5 2021 2 90 Hyderabad 6 2020 1 130 Chennai 7 2022 2 160 Chennai After setting index columns Year and Quarter Sales City Year Quarter 2022 1 100 Bangalore 3 150 Bangalore 2020 2 115 Bangalore 2021 3 120 Hyderabad 1 180 Hyderabad 2 90 Hyderabad 2020 1 130 Chennai 2022 2 160 Chennai Sort the data set by ascending order of Year column Sales City Year Quarter 2020 1 130 Chennai 2 115 Bangalore 2021 1 180 Hyderabad 2 90 Hyderabad 3 120 Hyderabad 2022 1 100 Bangalore 2 160 Chennai 3 150 Bangalore Sort the data set by ascending order of Year column Sales City Year Quarter 2020 1 130 Chennai 2 115 Bangalore 2021 1 180 Hyderabad 2 90 Hyderabad 3 120 Hyderabad 2022 1 100 Bangalore 2 160 Chennai 3 150 Bangalore Sort the data by descending order of the Quarter Sales City Year Quarter 2022 3 150 Bangalore 2021 3 120 Hyderabad 2022 2 160 Chennai 2021 2 90 Hyderabad 2020 2 115 Bangalore 2022 1 100 Bangalore 2021 1 180 Hyderabad 2020 1 130 Chennai Sort the data by descending order of the Quarter Sales City Year Quarter 2022 3 150 Bangalore 2021 3 120 Hyderabad 2022 2 160 Chennai 2021 2 90 Hyderabad 2020 2 115 Bangalore 2022 1 100 Bangalore 2021 1 180 Hyderabad 2020 1 130 Chennai
No comments:
Post a Comment