In this post, I explain how to drop columns from a pandas DataFrame.
There are three ways to drop columns from a DataFrame:
a. Using drop method
b. Using pop method
c. Using del operator
I use the following dataset to demonstrate the examples.
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89
a. Using drop method
Drop a column of a DataFrame
Syntax
df.drop(columns=column_name)
Example
df_drop_column_age = df.drop(columns='Age')
'df_drop_column_age' refers to the following DataFrame.
       Name       City  Gender  Rating
0   Krishna  Bangalore    Male      81
1     Sailu  Hyderabad  Female      76
2      Joel  Hyderabad    Male      67
3     Chamu    Chennai  Female     100
4  Jitendra  Bangalore    Male      87
5       Raj    Chennai    Male      89
Drop multiple columns of a DataFrame
Syntax
df.drop(columns=[column_name_1, column_name_2])
Example
df_drop_column_age_and_city = df.drop(columns=['Age', 'City'])
'df_drop_column_age_and_city' refers to the following DataFrame.
       Name  Gender  Rating
0   Krishna    Male      81
1     Sailu  Female      76
2      Joel    Male      67
3     Chamu  Female     100
4  Jitendra    Male      87
5       Raj    Male      89
By default, the drop method returns a new DataFrame and leaves the original DataFrame unchanged. To apply the change to the original DataFrame, set the inplace argument to True.
df.drop(columns=['Age', 'City'], inplace=True)
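As a quick check of the inplace behaviour described above, here is a minimal sketch. The two-row DataFrame is a cut-down version of the dataset in this post, and the variable names `without_age` and `result` are only illustrative.

```python
import pandas as pd

# Small illustrative DataFrame (a subset of the dataset used in this post)
df = pd.DataFrame({'Name': ['Krishna', 'Sailu'],
                   'Age': [34, 35],
                   'City': ['Bangalore', 'Hyderabad']})

# Without inplace, drop returns a new DataFrame and df keeps all its columns
without_age = df.drop(columns='Age')
print(list(without_age.columns))  # ['Name', 'City']
print(list(df.columns))           # ['Name', 'Age', 'City']

# With inplace=True, df itself is modified and the call returns None
result = df.drop(columns='Age', inplace=True)
print(result)            # None
print(list(df.columns))  # ['Name', 'City']

# Reassigning the returned DataFrame is an alternative to inplace=True
df = df.drop(columns='City')
print(list(df.columns))  # ['Name']
```

Because the inplace call returns None, writing `df = df.drop(..., inplace=True)` would replace df with None, so use either reassignment or inplace, not both.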
The complete working example is below.
drop_columns_using_drop_method.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Raj'],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}
df = pd.DataFrame(data)
df_drop_column_age = df.drop(columns='Age')
df_drop_column_age_and_city = df.drop(columns=['Age', 'City'])
print('Original DataFrame')
print(df)
print('\ndf_drop_column_age')
print(df_drop_column_age)
print('\ndf_drop_column_age_and_city')
print(df_drop_column_age_and_city)
df.drop(columns=['Age', 'City'], inplace=True)
print('\nOriginal DataFrame')
print(df)
Output
Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89

df_drop_column_age
       Name       City  Gender  Rating
0   Krishna  Bangalore    Male      81
1     Sailu  Hyderabad  Female      76
2      Joel  Hyderabad    Male      67
3     Chamu    Chennai  Female     100
4  Jitendra  Bangalore    Male      87
5       Raj    Chennai    Male      89

df_drop_column_age_and_city
       Name  Gender  Rating
0   Krishna    Male      81
1     Sailu  Female      76
2      Joel    Male      67
3     Chamu  Female     100
4  Jitendra    Male      87
5       Raj    Male      89

Original DataFrame
       Name  Gender  Rating
0   Krishna    Male      81
1     Sailu  Female      76
2      Joel    Male      67
3     Chamu  Female     100
4  Jitendra    Male      87
5       Raj    Male      89
b. Using pop method
You can also drop a column from a DataFrame using the pop method. The pop method removes a column from the given DataFrame and returns it as a Series.
age_column_data = df.pop('Age')
The pop method modifies the original DataFrame in place.
The complete working example is below.
drop_columns_using_pop_method.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Raj'],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)
age_column_data = df.pop('Age')
print('\nDataFrame after removing Age column')
print(df)
print('\nage_column_data')
print(age_column_data)
Output
Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89

DataFrame after removing Age column
       Name       City  Gender  Rating
0   Krishna  Bangalore    Male      81
1     Sailu  Hyderabad  Female      76
2      Joel  Hyderabad    Male      67
3     Chamu    Chennai  Female     100
4  Jitendra  Bangalore    Male      87
5       Raj    Chennai    Male      89

age_column_data
0    34
1    35
2    29
3    35
4    52
5    34
Name: Age, dtype: int64
Can I drop multiple columns using pop method?
No, the pop method takes a single column label, so it cannot remove multiple columns in one call.
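If you still want the removed data for several columns, one possible workaround is to call pop once per column. The sketch below is only an illustration: the list name `columns_to_remove` and the dictionary `popped` are hypothetical names, and the DataFrame is a shortened version of the dataset in this post.

```python
import pandas as pd

# Shortened version of the sample dataset
df = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel'],
                   'Age': [34, 35, 29],
                   'City': ['Bangalore', 'Hyderabad', 'Hyderabad'],
                   'Rating': [81, 76, 67]})

# Hypothetical list of columns to remove
columns_to_remove = ['Age', 'City']

# Call pop once per column and keep each returned Series by column name
popped = {name: df.pop(name) for name in columns_to_remove}

print(list(df.columns))         # ['Name', 'Rating']
print(popped['Age'].tolist())   # [34, 35, 29]
print(popped['City'].tolist())  # ['Bangalore', 'Hyderabad', 'Hyderabad']
```

If you do not need the removed data, the drop method with a list of column names is the simpler choice.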
c. Using del operator
You can also drop a column using the del operator. Like pop, del modifies the original DataFrame.
Syntax
del df[column_to_drop]
drop_columns_using_del_operator.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Raj'],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)
del df['Age']
print('\nDataFrame after removing Age column')
print(df)
Output
Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89

DataFrame after removing Age column
       Name       City  Gender  Rating
0   Krishna  Bangalore    Male      81
1     Sailu  Hyderabad  Female      76
2      Joel  Hyderabad    Male      67
3     Chamu    Chennai  Female     100
4  Jitendra  Bangalore    Male      87
5       Raj    Chennai    Male      89
Can I drop multiple columns using del operator?
No, del df[column_name] removes one column at a time; it does not accept a list of column names.
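If you prefer del, one way to remove several columns is to loop over the column names. This is a minimal sketch on a shortened version of the dataset; the list name `columns_to_remove` is hypothetical.

```python
import pandas as pd

# Shortened version of the sample dataset
df = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel'],
                   'Age': [34, 35, 29],
                   'City': ['Bangalore', 'Hyderabad', 'Hyderabad'],
                   'Rating': [81, 76, 67]})

# Hypothetical list of columns to remove
columns_to_remove = ['Age', 'City']

# del handles one column per df[...] target, so delete them one by one
for column in columns_to_remove:
    del df[column]

print(list(df.columns))  # ['Name', 'Rating']
```

For removing several columns in a single call, the drop method with a list of column names, shown earlier in this post, is usually the more direct option.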