Using the 'nunique' method, we can find the number of distinct elements along a specified axis. By default it counts per column and ignores missing values.
Example
distinct_column_values = df.nunique()
The statement above returns a Series holding the number of distinct values in each column of the data frame.
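The 'nunique' method also accepts the 'axis' and 'dropna' parameters. The following is a minimal sketch of how they change the result; the small data frame and its 'City' and 'Office' columns are made up for illustration and are not part of the example in this post.

```python
import pandas as pd

# Hypothetical sample data with one missing value in 'City'
df = pd.DataFrame({
    'City': ['Bangalore', 'Hyderabad', None, 'Bangalore'],
    'Office': ['Bangalore', 'Chennai', 'Chennai', 'Bangalore'],
})

# Default: axis=0 (one count per column), missing values are not counted
per_column = df.nunique()

# dropna=False treats the missing value as one more distinct value
per_column_with_nan = df.nunique(dropna=False)

# axis=1 counts the distinct values within each row instead
per_row = df.nunique(axis=1)

print(per_column)
print(per_column_with_nan)
print(per_row)
```

Here 'City' has 2 distinct values by default and 3 when 'dropna=False' is passed, because the missing entry is then counted as well.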
count_distinct_values_in_a_column.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Krishna'],
        'Age': [34, 35, 234, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [67, 43, 67, 100, 41, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)
distinct_column_values = df.nunique()
print('\ndistinct_values\n', distinct_column_values)
print('\nTotal distinct names : ', distinct_column_values['Name'])
print('Total distinct ages : ', distinct_column_values['Age'])
print('Total distinct cities : ', distinct_column_values['City'])
print('Total distinct genders : ', distinct_column_values['Gender'])
print('Total distinct ratings : ', distinct_column_values['Rating'])
Output
Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      67
1     Sailu   35  Hyderabad  Female      43
2      Joel  234  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      41
5   Krishna   34    Chennai    Male      89
distinct_values
Name 5
Age 4
City 3
Gender 2
Rating 5
dtype: int64
Total distinct names : 5
Total distinct ages : 4
Total distinct cities : 3
Total distinct genders : 2
Total distinct ratings : 5
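When only one column is of interest, 'nunique' can also be called on that column directly. The sketch below reuses the 'City' values from the example above and, as an addition that is not part of the original program, also shows 'unique' and 'value_counts' to list the distinct values and how often each one occurs.

```python
import pandas as pd

# Same 'City' values as in the example above
df = pd.DataFrame({'City': ['Bangalore', 'Hyderabad', 'Hyderabad',
                            'Chennai', 'Bangalore', 'Chennai']})

# Number of distinct values in a single column
city_count = df['City'].nunique()

# The distinct values themselves, in order of first appearance
city_values = df['City'].unique()

# How many times each distinct value occurs
city_frequencies = df['City'].value_counts()

print('Total distinct cities :', city_count)
print('Distinct cities :', list(city_values))
print(city_frequencies)
```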