Tuesday, 12 December 2023

Pandas: Drop DataFrame rows

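The ‘drop’ method removes rows from a DataFrame by index label. With the default integer index the labels are the row numbers; with a custom index they are the values of that index (for example, names).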

 

The examples below use the following dataset.

       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89

Drop a row at a specific index

Syntax

df.drop(row_index)

Example

df_without_row_2 = df.drop(2)

df_without_row_2 contains the following data.

       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89
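Notice that the index is not renumbered after the drop: the labels go 0, 1, 3, 4, 5. The short sketch below, using a trimmed-down version of the dataset (an assumption made to keep it brief), illustrates that drop matches index labels rather than positions, and shows one possible way to drop by position by looking up the label first.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu'],
                   'Age': [34, 35, 29, 35]})

# Drop the row labelled 2; the remaining labels are not renumbered.
df_without_row_2 = df.drop(2)
print(df_without_row_2.index.tolist())

# drop() matches labels, so 3 refers to the row labelled 3 ('Chamu'),
# even though only three rows remain.
df_without_chamu = df_without_row_2.drop(3)
print(df_without_chamu['Name'].tolist())

# To drop by position instead, look up the label at that position first.
first_label = df_without_row_2.index[0]
df_without_first = df_without_row_2.drop(first_label)
print(df_without_first['Name'].tolist())
```

If consecutive labels are needed afterwards, calling reset_index(drop=True) on the result renumbers the rows from 0.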

Drop rows at indexes 3 and 5

Syntax

df.drop([row_index_1, row_index_2])

Example

df_drop_rows_3_and_5 = df.drop([3, 5])

df_drop_rows_3_and_5 contains the following data.

       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
4  Jitendra   52  Bangalore    Male      87

By default, the drop method returns a new DataFrame and leaves the original unchanged. To modify the original DataFrame itself, set the ‘inplace’ argument to True.

df.drop([3, 5], inplace=True)
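The difference is easy to check by counting rows before and after. The following is a minimal sketch on a two-column version of the dataset (the variable names are only for illustration); it also shows that the call returns None when inplace is True.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Raj'],
                   'Rating': [81, 76, 67, 100, 87, 89]})

# Without inplace, drop() returns a new DataFrame and leaves df untouched.
result = df.drop([3, 5])
rows_before = len(df)          # df still has all 6 rows
rows_in_result = len(result)   # the returned copy has 4 rows

# With inplace=True, df itself is modified and the call returns None.
returned = df.drop([3, 5], inplace=True)
rows_after = len(df)           # df now has 4 rows

print(rows_before, rows_in_result, returned, rows_after)
```

Reassigning the result, as in df = df.drop([3, 5]), has the same effect on the variable and is a common alternative to inplace.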

A complete working example follows.

 

drop_rows_using_index.py

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', "Raj"],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}

df = pd.DataFrame(data)
df_without_row_2 = df.drop(2)
df_drop_rows_3_and_5 = df.drop([3, 5])

print('Original DataFrame')
print(df)

print('\ndf_without_row_2')
print(df_without_row_2)

print('\ndf_drop_rows_3_and_5')
print(df_drop_rows_3_and_5)

df.drop([3, 5], inplace=True)
print('\nOriginal DataFrame after dropping rows 3 and 5')
print(df)

Output

Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89

df_without_row_2
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      87
5       Raj   34    Chennai    Male      89

df_drop_rows_3_and_5
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
4  Jitendra   52  Bangalore    Male      87

Original DataFrame after dropping rows 3 and 5
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      81
1     Sailu   35  Hyderabad  Female      76
2      Joel   29  Hyderabad    Male      67
4  Jitendra   52  Bangalore    Male      87

Drop rows using row labels

The examples below use the following dataset, in which the ‘Name’ column is the index.

          Age       City  Gender  Rating
Name                                    
Krishna    34  Bangalore    Male      81
Sailu      35  Hyderabad  Female      76
Joel       29  Hyderabad    Male      67
Chamu      35    Chennai  Female     100
Jitendra   52  Bangalore    Male      87
Raj        34    Chennai    Male      89

Drop a row by its label

Syntax

df.drop(row_label)

Example

df_drop_krishna = df.drop('Krishna')

df_drop_krishna contains the following data.

          Age       City  Gender  Rating
Name                                    
Sailu      35  Hyderabad  Female      76
Joel       29  Hyderabad    Male      67
Chamu      35    Chennai  Female     100
Jitendra   52  Bangalore    Male      87
Raj        34    Chennai    Male      89

Drop multiple rows using row labels

Syntax

df.drop([row_label_1, row_label_2])

Example

df_drop_krishna_and_chamu = df.drop(['Krishna', 'Chamu'])

df_drop_krishna_and_chamu contains the following data.

          Age       City  Gender  Rating
Name                                    
Sailu      35  Hyderabad  Female      76
Joel       29  Hyderabad    Male      67
Jitendra   52  Bangalore    Male      87
Raj        34    Chennai    Male      89
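One thing to watch for when dropping by label is a label that does not exist in the index. The sketch below assumes a smaller version of the dataset and a hypothetical label 'Ravi' that is not present; it shows the default behaviour (a KeyError) and the errors='ignore' option of drop, which skips missing labels.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel'],
                   'Age': [34, 35, 29]}).set_index('Name')

# 'Ravi' is a hypothetical label that is not in the index.
# By default, asking drop() to remove a missing label raises KeyError.
try:
    df.drop(['Krishna', 'Ravi'])
    raised = False
except KeyError:
    raised = True
print(raised)

# errors='ignore' drops the labels that exist and skips the rest.
remaining = df.drop(['Krishna', 'Ravi'], errors='ignore')
print(remaining.index.tolist())
```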

As before, the drop method leaves the original DataFrame unchanged unless the ‘inplace’ argument is set to True.

df.drop(['Krishna', 'Chamu'], inplace=True)

A complete working example follows.

 

drop_rows_using_labels.py

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', "Raj"],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}

df = pd.DataFrame(data)
df.set_index('Name', inplace=True)

df_drop_krishna = df.drop('Krishna')
df_drop_krishna_and_chamu = df.drop(['Krishna', 'Chamu'])

print('Original DataFrame')
print(df)

print('\nRow with label "Krishna" is deleted')
print(df_drop_krishna)

print('\nRow with label "Krishna" and "Chamu" are deleted')
print(df_drop_krishna_and_chamu)

df.drop(['Krishna', 'Chamu'], inplace=True)
print('\nOriginal DataFrame after dropping rows with labels "Krishna" and "Chamu"')
print(df)

Output

Original DataFrame
          Age       City  Gender  Rating
Name                                    
Krishna    34  Bangalore    Male      81
Sailu      35  Hyderabad  Female      76
Joel       29  Hyderabad    Male      67
Chamu      35    Chennai  Female     100
Jitendra   52  Bangalore    Male      87
Raj        34    Chennai    Male      89

Row with label "Krishna" is deleted
          Age       City  Gender  Rating
Name                                    
Sailu      35  Hyderabad  Female      76
Joel       29  Hyderabad    Male      67
Chamu      35    Chennai  Female     100
Jitendra   52  Bangalore    Male      87
Raj        34    Chennai    Male      89

Row with label "Krishna" and "Chamu" are deleted
          Age       City  Gender  Rating
Name                                    
Sailu      35  Hyderabad  Female      76
Joel       29  Hyderabad    Male      67
Jitendra   52  Bangalore    Male      87
Raj        34    Chennai    Male      89

Original DataFrame after dropping rows with labels "Krishna" and "Chamu"
          Age       City  Gender  Rating
Name                                    
Sailu      35  Hyderabad  Female      76
Joel       29  Hyderabad    Male      67
Jitendra   52  Bangalore    Male      87
Raj        34    Chennai    Male      89


 
