We can fill the missing values in a DataFrame using the fillna method.
Example
df_without_missing_values = df.fillna(0)
The snippet above replaces all the missing values with 0 and assigns the result to the variable ‘df_without_missing_values’. It does not affect the original DataFrame.
To apply the change to the original DataFrame itself, set the argument inplace to True.
df.fillna(0, inplace=True)
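As a quick check of the difference between the two forms, the sketch below counts the missing values before and after each call. The small DataFrame and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with three missing values in total
df = pd.DataFrame({'A': [1.0, np.nan, 3.0], 'B': [np.nan, 2.0, np.nan]})

# fillna without inplace returns a new DataFrame; df keeps its NaN values
filled = df.fillna(0)
missing_in_original = int(df.isna().sum().sum())
missing_in_copy = int(filled.isna().sum().sum())

# inplace=True modifies df itself
df.fillna(0, inplace=True)
missing_after_inplace = int(df.isna().sum().sum())

print(missing_in_original)    # 3
print(missing_in_copy)        # 0
print(missing_after_inplace)  # 0
```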
A complete working example follows.
fill_missing_values.py
import pandas as pd
import numpy as np
# Create a sample DataFrame
data = {
'A' : [1, 2, np.nan, 3, 4, np.nan, 5],
'B' : [1, np.nan, np.nan, 3, 4, 5, 6],
'C' : [1, 2, np.nan, 4, 5, np.nan, 6],
'D' : ['a', 'b', None, None, 'c', 'd', 'e']
}
df = pd.DataFrame(data)
df_without_missing_values = df.fillna(0)
print('Original DataFrame')
print(df)
print('\nDataFrame by filling with 0')
print(df_without_missing_values)
df.fillna(0, inplace=True)
print('\nOriginal DataFrame by filling with the argument inplace=True')
print(df)
Output
Original DataFrame
     A    B    C     D
0  1.0  1.0  1.0     a
1  2.0  NaN  2.0     b
2  NaN  NaN  NaN  None
3  3.0  3.0  4.0  None
4  4.0  4.0  5.0     c
5  NaN  5.0  NaN     d
6  5.0  6.0  6.0     e

DataFrame by filling with 0
     A    B    C  D
0  1.0  1.0  1.0  a
1  2.0  0.0  2.0  b
2  0.0  0.0  0.0  0
3  3.0  3.0  4.0  0
4  4.0  4.0  5.0  c
5  0.0  5.0  0.0  d
6  5.0  6.0  6.0  e

Original DataFrame by filling with the argument inplace=True
     A    B    C  D
0  1.0  1.0  1.0  a
1  2.0  0.0  2.0  b
2  0.0  0.0  0.0  0
3  3.0  3.0  4.0  0
4  4.0  4.0  5.0  c
5  0.0  5.0  0.0  d
6  5.0  6.0  6.0  e
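Note that filling every column with 0 also puts a 0 into the string column D. If that is not wanted, fillna also accepts a dict that maps column names to fill values. The sketch below is a minimal illustration; the shortened sample data and the placeholder string 'missing' are assumptions, not part of the example above.

```python
import numpy as np
import pandas as pd

# Hypothetical, shortened version of the sample data
df = pd.DataFrame({
    'A': [1, 2, np.nan],
    'D': ['a', None, 'c'],
})

# Map each column name to the value that should replace its missing entries
fill_values = {'A': 0, 'D': 'missing'}
filled = df.fillna(value=fill_values)

print(filled)
```

Columns that are not listed in the dict are left unchanged.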
Fill the missing values in a specific column
df['City'] = df['City'].fillna('not_found')
The statement above fills the missing values in the ‘City’ column with the value ‘not_found’.
df['Age'] = df['Age'].fillna(0)
The statement above fills the missing values in the ‘Age’ column with the value 0.
A complete working example follows.
fill_missing_values_in_a_column.py
import pandas as pd
import numpy as np
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Ram', 'Joel', 'Gopi', 'Jitendra', "Raj"],
'Age': [34, np.nan, 29, 41, 52, np.nan],
'City': ['Bangalore', None, 'Hyderabad', None, 'Bangalore', 'Chennai']}
df = pd.DataFrame(data)
print(df)
df['City'] = df['City'].fillna('not_found')
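# Assign the result back to the column. Calling fillna(..., inplace=True) on
# df['Age'] acts on a temporary object and may not update df in newer pandas versions.
df['Age'] = df['Age'].fillna(0)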
print('\nAfter filling missing values')
print(df)
Output
       Name   Age       City
0   Krishna  34.0  Bangalore
1       Ram   NaN       None
2      Joel  29.0  Hyderabad
3      Gopi  41.0       None
4  Jitendra  52.0  Bangalore
5       Raj   NaN    Chennai

After filling missing values
       Name   Age       City
0   Krishna  34.0  Bangalore
1       Ram   0.0  not_found
2      Joel  29.0  Hyderabad
3      Gopi  41.0  not_found
4  Jitendra  52.0  Bangalore
5       Raj   0.0    Chennai
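The fill value for a column does not have to be a constant. As one possible variation (an assumption for illustration, not something the example above does), the sketch below fills the missing ages with the mean of the known ages.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Krishna', 'Ram', 'Joel', 'Gopi', 'Jitendra', 'Raj'],
    'Age': [34, np.nan, 29, 41, 52, np.nan],
})

# mean() skips NaN values by default, so only the four known ages are averaged
mean_age = df['Age'].mean()

# Assign the filled column back to the DataFrame
df['Age'] = df['Age'].fillna(mean_age)

print(mean_age)  # 39.0
print(df)
```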