Monday, 8 April 2024

Pandas: pivot: Reshape the dataframe

In this post, I am going to explain how to reshape a DataFrame using the pivot method.

 

Signature

DataFrame.pivot(index=None, columns=None, values=None)

 

The following table summarizes the parameters of the pivot method.

 

Parameter    Description
index        The column(s) to use as the new DataFrame index.
columns      The column(s) to use as the new DataFrame columns.
values       The column(s) to use as the values in the new DataFrame.
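To make the three parameters concrete, here is a minimal sketch on a tiny made-up frame. The `sales` data and its column names are my own assumptions for illustration and are not the dataset used in the rest of this post.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the three parameters.
sales = pd.DataFrame({
    'day': ['Mon', 'Mon', 'Tue', 'Tue'],
    'item': ['pen', 'ink', 'pen', 'ink'],
    'qty': [3, 1, 5, 2],
})

# 'day' becomes the row index, each distinct 'item' becomes a column,
# and the cells are filled from 'qty'.
wide = sales.pivot(index='day', columns='item', values='qty')
print(wide)
```

Each (day, item) pair appears exactly once here, so every cell of the result gets exactly one value.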

 

Let’s experiment with the dataset below.

       Name  Age       City  Gender      Education
0   Krishna   34  Bangalore    Male       Graduate
1     Sailu   35  Hyderabad  Female  Post Graduate
2      Joel   29  Hyderabad    Male            PHD
3     Chamu   35    Chennai  Female       Graduate
4  Jitendra   52  Bangalore    Male       Graduate
5   Krishna   34    Chennai    Male   Intermediate

Let’s reshape the DataFrame by using

a.   ['Education', 'Age'] as the index columns

b.   ['City', 'Gender'] as the column headers

c.   ['Name'] as the cell values

 

new_df = df.pivot(index=['Education', 'Age'], columns=['City', 'Gender'], values=['Name'])

 

new_df now holds the following data.

                        Name                                 
City              Bangalore Hyderabad       Chennai         
Gender                 Male    Female  Male  Female     Male
Education     Age                                           
Graduate      34    Krishna       NaN   NaN     NaN      NaN
              35        NaN       NaN   NaN   Chamu      NaN
              52   Jitendra       NaN   NaN     NaN      NaN
Intermediate  34        NaN       NaN   NaN     NaN  Krishna
PHD           29        NaN       NaN  Joel     NaN      NaN
Post Graduate 35        NaN     Sailu   NaN     NaN      NaN
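Since both the rows and the columns of the result are multi-level, reading a single cell needs one tuple per axis. The sketch below rebuilds the same DataFrame and looks up one name. It also shows that passing values as a plain string instead of a list gives a result without the outer 'Name' column level. The variable names `cell` and `flat` are just for illustration.

```python
import pandas as pd

data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Krishna'],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Education': ['Graduate', 'Post Graduate', 'PHD', 'Graduate', 'Graduate', 'Intermediate']}
df = pd.DataFrame(data)

new_df = df.pivot(index=['Education', 'Age'], columns=['City', 'Gender'], values=['Name'])

# Row key is (Education, Age); column key is (value column, City, Gender).
cell = new_df.loc[('Graduate', 34), ('Name', 'Bangalore', 'Male')]
print(cell)

# With values given as a string, the columns have only the City and Gender levels.
flat = df.pivot(index=['Education', 'Age'], columns=['City', 'Gender'], values='Name')
print(new_df.columns.nlevels, flat.columns.nlevels)
```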

 

Find the complete working application below.

 

pivot.py
import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Krishna'],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Education': ['Graduate', 'Post Graduate', 'PHD', 'Graduate', 'Graduate', 'Intermediate']}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)

# Rows are keyed by (Education, Age), columns by (City, Gender), cells hold Name
new_df = df.pivot(index=['Education', 'Age'], columns=['City', 'Gender'], values=['Name'])
print('\nnew_df\n', new_df)

Output

Original DataFrame
       Name  Age       City  Gender      Education
0   Krishna   34  Bangalore    Male       Graduate
1     Sailu   35  Hyderabad  Female  Post Graduate
2      Joel   29  Hyderabad    Male            PHD
3     Chamu   35    Chennai  Female       Graduate
4  Jitendra   52  Bangalore    Male       Graduate
5   Krishna   34    Chennai    Male   Intermediate

new_df
                        Name                                 
City              Bangalore Hyderabad       Chennai         
Gender                 Male    Female  Male  Female     Male
Education     Age                                           
Graduate      34    Krishna       NaN   NaN     NaN      NaN
              35        NaN       NaN   NaN   Chamu      NaN
              52   Jitendra       NaN   NaN     NaN      NaN
Intermediate  34        NaN       NaN   NaN     NaN  Krishna
PHD           29        NaN       NaN  Joel     NaN      NaN
Post Graduate 35        NaN     Sailu   NaN     NaN      NaN
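One thing to keep in mind: pivot only reshapes, it does not aggregate. The example above works because every combination of index and column values occurs at most once. The sketch below uses a small made-up frame (the `dup` data is an assumption for illustration) where the same combination occurs twice, which makes pivot raise a ValueError. As one possible workaround, it then uses pivot_table with an aggregation function that joins the names.

```python
import pandas as pd

# Hypothetical data where the (Education, City) pair is repeated.
dup = pd.DataFrame({
    'Education': ['Graduate', 'Graduate'],
    'City': ['Bangalore', 'Bangalore'],
    'Name': ['Krishna', 'Jitendra'],
})

try:
    dup.pivot(index='Education', columns='City', values='Name')
    raised = False
except ValueError as err:
    # pivot cannot decide which of the two names belongs in the single cell
    raised = True
    print('pivot failed:', err)

# pivot_table accepts duplicates because it aggregates them;
# here the duplicate names are joined into one string.
merged = dup.pivot_table(index='Education', columns='City', values='Name',
                         aggfunc=lambda names: ', '.join(names))
print(merged)
```

Joining the names is only one choice of aggregation. Which one is appropriate depends on what the duplicates mean in your data.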

 

 
