Using the 'nunique' method, we can find the number of distinct elements along a specified axis of a DataFrame.
Example
distinct_column_values = df.nunique()
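The above statement returns the number of distinct values in each column of the data frame, as a Series indexed by column name. Missing values (NaN) are not counted by default.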
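To illustrate the "specified axis" part, here is a minimal sketch comparing the default column-wise count with a row-wise count. The 'scores' frame and its column names are made up for this illustration only.

```python
import pandas as pd

# Hypothetical frame, used only for this illustration
scores = pd.DataFrame({'Test1': [10, 10, 30],
                       'Test2': [10, 25, 30],
                       'Test3': [10, 20, 35]})

# axis=0 (the default): one count per column
per_column = scores.nunique()

# axis=1: one count per row
per_row = scores.nunique(axis=1)

print('Distinct values per column\n', per_column)
print('\nDistinct values per row\n', per_row)
```

With axis=0 the result is indexed by column name, and with axis=1 it is indexed by the row labels.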
count_distinct_values_in_a_column.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Krishna'],
        'Age': [34, 35, 234, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [67, 43, 67, 100, 41, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)
distinct_column_values = df.nunique()
print('\ndistinct_values\n', distinct_column_values)
print('\nTotal distinct names : ', distinct_column_values['Name'])
print('Total distinct ages : ', distinct_column_values['Age'])
print('Total distinct cities : ', distinct_column_values['City'])
print('Total distinct genders : ', distinct_column_values['Gender'])
print('Total distinct ratings : ', distinct_column_values['Rating'])
Output
Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      67
1     Sailu   35  Hyderabad  Female      43
2      Joel  234  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      41
5   Krishna   34    Chennai    Male      89

distinct_values
 Name      5
Age       4
City      3
Gender    2
Rating    5
dtype: int64

Total distinct names :  5
Total distinct ages :  4
Total distinct cities :  3
Total distinct genders :  2
Total distinct ratings :  5
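If only one column is of interest, 'nunique' can also be called on that column directly. The sketch below shows this, along with the 'dropna' argument that controls whether missing values are counted. The 'people' frame is hypothetical and only serves this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing city, used only for this illustration
people = pd.DataFrame({'Name': ['Krishna', 'Sailu', 'Joel', 'Krishna'],
                       'City': ['Bangalore', np.nan, 'Chennai', 'Bangalore']})

# Series.nunique returns a single integer for the selected column
distinct_names = people['Name'].nunique()

# By default NaN is ignored when counting
distinct_cities = people['City'].nunique()

# dropna=False counts NaN as one more distinct value
distinct_cities_with_nan = people['City'].nunique(dropna=False)

print('Distinct names :', distinct_names)
print('Distinct cities :', distinct_cities)
print('Distinct cities, counting NaN :', distinct_cities_with_nan)
```

Calling the method on a single column avoids computing counts for columns that are not needed.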