Tuesday, 12 December 2023

Pandas: Drop DataFrame columns

In this post, I explain how to drop columns from a DataFrame.

 

There are multiple ways to drop columns from a DataFrame.

a. Using drop method

b. Using pop method

c. Using del operator

 

I am using the dataset below to demonstrate the examples.

       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89

 

a. Using drop method

 

Drop a column of a DataFrame

Syntax

df.drop(columns=column_name)

 

Example

df_drop_column_age = df.drop(columns='Age')

‘df_drop_column_age’ now holds the following data.

       Name       City  Gender  Rating
0   Krishna  Bangalore    Male      81
1     Sailu  Hyderabad  Female      76
2      Joel  Hyderabad    Male      67
3     Chamu    Chennai  Female     100
4  Jitendra  Bangalore    Male      87
5       Raj    Chennai    Male      89
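
As a side note, the same drop can be written with a label plus axis=1 instead of the columns keyword. The snippet below is a minimal sketch on a trimmed copy of the sample data; the variable names by_keyword and by_axis are made up for illustration.

```python
import pandas as pd

# Trimmed copy of the sample data used in this post
df = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel'],
                   'Age': [34, 35, 29],
                   'City': ['Bangalore', 'Hyderabad', 'Hyderabad']})

# Two spellings of the same operation
by_keyword = df.drop(columns='Age')
by_axis = df.drop('Age', axis=1)

print(by_keyword.equals(by_axis))  # True
print(list(by_axis.columns))       # ['Name', 'City']
```

The columns keyword tends to read more clearly, since it states the intent without needing to remember what axis=1 means.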

Drop columns of a DataFrame

Syntax

df.drop(columns=[column_name_1, column_name_2])

Example

df_drop_column_age_and_city = df.drop(columns=['Age', 'City'])

‘df_drop_column_age_and_city’ now holds the following data.

       Name  Gender  Rating
0   Krishna    Male      81
1     Sailu  Female      76
2      Joel    Male      67
3     Chamu  Female     100
4  Jitendra    Male      87
5       Raj    Male      89
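
If any name in the list is not a column of the DataFrame, drop raises a KeyError by default. The sketch below shows one way to skip missing names with errors='ignore'; the 'Salary' column is a hypothetical name that does not exist in the sample data.

```python
import pandas as pd

# Trimmed copy of the sample data used in this post
df = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel'],
                   'Age': [34, 35, 29],
                   'City': ['Bangalore', 'Hyderabad', 'Hyderabad']})

# 'Salary' is a hypothetical column that is not in df
try:
    df.drop(columns=['Age', 'Salary'])
except KeyError as exc:
    print('KeyError:', exc)

# errors='ignore' drops the labels that exist and skips the rest
safe = df.drop(columns=['Age', 'Salary'], errors='ignore')
print(list(safe.columns))  # ['Name', 'City']
```

This can be handy when the same cleanup code runs against inputs whose columns vary, though it can also hide typos in column names.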

By default, the drop method returns a new DataFrame and leaves the original unchanged. To apply the changes to the original DataFrame, set the inplace argument to True.

df.drop(columns=['Age', 'City'], inplace=True)
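
One detail worth keeping in mind: with inplace=True the call returns None, so its result should not be assigned back to the variable. The following is a small sketch on a trimmed copy of the sample data; the name result is only for illustration.

```python
import pandas as pd

# Trimmed copy of the sample data used in this post
df = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel'],
                   'Age': [34, 35, 29],
                   'City': ['Bangalore', 'Hyderabad', 'Hyderabad'],
                   'Gender': ['Male', 'Female', 'Male']})

# With inplace=True, df itself is modified and the call returns None
result = df.drop(columns=['Age', 'City'], inplace=True)

print(result)            # None
print(list(df.columns))  # ['Name', 'Gender']
```

Reassigning the returned DataFrame, as in df = df.drop(columns=['Age', 'City']), gives the same end result without inplace.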

The complete working example is below.

 

drop_columns_using_drop_method.py

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', "Raj"],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}

df = pd.DataFrame(data)
df_drop_column_age = df.drop(columns='Age')
df_drop_column_age_and_city = df.drop(columns=['Age', 'City'])

print('Original DataFrame')
print(df)

print('\ndf_drop_column_age')
print(df_drop_column_age)

print('\ndf_drop_column_age_and_city')
print(df_drop_column_age_and_city)

df.drop(columns=['Age', 'City'], inplace=True)
print('\nOriginal DataFrame')
print(df)

Output

Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89

df_drop_column_age
       Name       City  Gender  Rating
0   Krishna  Bangalore    Male      81
1     Sailu  Hyderabad  Female      76
2      Joel  Hyderabad    Male      67
3     Chamu    Chennai  Female     100
4  Jitendra  Bangalore    Male      87
5       Raj    Chennai    Male      89

df_drop_column_age_and_city
       Name  Gender  Rating
0   Krishna    Male      81
1     Sailu  Female      76
2      Joel    Male      67
3     Chamu  Female     100
4  Jitendra    Male      87
5       Raj    Male      89

Original DataFrame
       Name  Gender  Rating
0   Krishna    Male      81
1     Sailu  Female      76
2      Joel    Male      67
3     Chamu  Female     100
4  Jitendra    Male      87
5       Raj    Male      89

b. Using pop method

You can also drop a column from a DataFrame using the pop method. The ‘pop’ method removes a column from the given DataFrame and returns it as a Series.

age_column_data = df.pop('Age')

The pop method modifies the original DataFrame in place.

 

The complete working example is below.

 

drop_columns_using_pop_method.py

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', "Raj"],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}

df = pd.DataFrame(data)
print('Original DataFrame')
print(df)

age_column_data = df.pop('Age')
print('\nDataFrame after removing Age column')
print(df)

print('\nage_column_data')
print(age_column_data)

Output

Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89

DataFrame after removing Age column
       Name       City  Gender  Rating
0   Krishna  Bangalore    Male      81
1     Sailu  Hyderabad  Female      76
2      Joel  Hyderabad    Male      67
3     Chamu    Chennai  Female     100
4  Jitendra  Bangalore    Male      87
5       Raj    Chennai    Male      89

age_column_data
0    34
1    35
2    29
3    35
4    52
5    34
Name: Age, dtype: int64

Can I drop multiple columns using pop method?

No, the ‘pop’ method does not support removing multiple columns simultaneously. It accepts a single column label per call.
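
If you need the removed data for several columns, one possible workaround is to call pop once per column. The sketch below uses a trimmed copy of the sample data; the names columns_to_pop and popped are made up for illustration.

```python
import pandas as pd

# Trimmed copy of the sample data used in this post
df = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel'],
                   'Age': [34, 35, 29],
                   'City': ['Bangalore', 'Hyderabad', 'Hyderabad'],
                   'Gender': ['Male', 'Female', 'Male']})

columns_to_pop = ['Age', 'City']

# Call pop once per column and keep each removed Series by name
popped = {name: df.pop(name) for name in columns_to_pop}

print(list(df.columns))         # ['Name', 'Gender']
print(popped['Age'].tolist())   # [34, 35, 29]
```

When the removed data is not needed, df.drop(columns=['Age', 'City']) is the simpler choice.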

 

c. Using del operator

You can also drop a column using the ‘del’ operator. Like pop, it modifies the original DataFrame in place.

 

Syntax

del df[column_to_drop]

drop_columns_using_del_operator.py

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', "Raj"],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}

df = pd.DataFrame(data)
print('Original DataFrame')
print(df)

del df['Age']
print('\nDataFrame after removing Age column')
print(df)

Output

Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89

DataFrame after removing Age column
       Name       City  Gender  Rating
0   Krishna  Bangalore    Male      81
1     Sailu  Hyderabad  Female      76
2      Joel  Hyderabad    Male      67
3     Chamu    Chennai  Female     100
4  Jitendra  Bangalore    Male      87
5       Raj    Chennai    Male      89

Can I drop multiple columns using del operator?

No, each ‘del df[...]’ target removes a single column. The ‘del’ operator does not accept a list of column names.
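
To remove several columns with ‘del’ anyway, one option is a plain loop that deletes one column per iteration. This is a minimal sketch on a trimmed copy of the sample data; the name columns_to_delete is made up for illustration.

```python
import pandas as pd

# Trimmed copy of the sample data used in this post
df = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel'],
                   'Age': [34, 35, 29],
                   'City': ['Bangalore', 'Hyderabad', 'Hyderabad'],
                   'Gender': ['Male', 'Female', 'Male']})

columns_to_delete = ['Age', 'City']

# del removes one column at a time, so loop over the names
for column in columns_to_delete:
    del df[column]

print(list(df.columns))  # ['Name', 'Gender']
```

For more than one column, the drop method with a list of names is usually the more direct route.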

 
