Wednesday 24 January 2024

Pandas: Count unique values in each column of a DataFrame

Using the 'nunique' method, we can find the number of distinct elements along the specified axis. By default (axis=0) it counts per column and ignores missing values.

Example

distinct_column_values = df.nunique()

The above statement returns the count of distinct values in each column of the data frame, as a Series indexed by column name.
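To make the "specified axis" part concrete, here is a minimal sketch comparing the default column-wise count with a row-wise count. The frame 'toy' and its column names are made up for illustration and are not part of the example that follows.

```python
import pandas as pd

# Hypothetical toy frame, used only to illustrate the axis argument
toy = pd.DataFrame({'a': [1, 1, 2],
                    'b': [1, 3, 2],
                    'c': [1, 3, 3]})

# axis=0 (the default): one count per column
per_column = toy.nunique()

# axis=1: one count per row
per_row = toy.nunique(axis=1)

print('Distinct values per column\n', per_column)
print('\nDistinct values per row\n', per_row)
```

Here column 'b' holds three different values, while the first row holds the same value in every column, so its row-wise count is 1.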

 

count_distinct_values_in_a_column.py

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Jitendra', 'Krishna'],
        'Age': [34, 35, 234, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [67, 43, 67, 100, 41, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)

distinct_column_values = df.nunique()
print('\ndistinct_values\n', distinct_column_values)

print('\nTotal distinct names : ', distinct_column_values['Name'])
print('Total distinct ages : ', distinct_column_values['Age'])
print('Total distinct cities : ', distinct_column_values['City'])
print('Total distinct genders : ', distinct_column_values['Gender'])
print('Total distinct ratings : ', distinct_column_values['Rating'])

Output

Original DataFrame
       Name  Age       City  Gender  Rating
0   Krishna   34  Bangalore    Male      67
1     Sailu   35  Hyderabad  Female      43
2      Joel  234  Hyderabad    Male      67
3     Chamu   35    Chennai  Female     100
4  Jitendra   52  Bangalore    Male      41
5   Krishna   34    Chennai    Male      89

distinct_values
 Name      5
Age       4
City      3
Gender    2
Rating    5
dtype: int64

Total distinct names :  5
Total distinct ages :  4
Total distinct cities :  3
Total distinct genders :  2
Total distinct ratings :  5
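The data above has no missing values, so the sketch below uses a small made-up column (the frame 'sample' is an assumption for illustration) to show how missing values affect the count. By default 'nunique' skips them; passing dropna=False counts a missing value as one more distinct value. It also shows that a single column can be counted directly with the Series version of the method.

```python
import pandas as pd

# Hypothetical column with one missing entry
sample = pd.DataFrame({'City': ['Chennai', None, 'Chennai', 'Hyderabad']})

# Default: missing values are ignored
without_missing = sample['City'].nunique()

# dropna=False: the missing value is counted as a distinct value
with_missing = sample['City'].nunique(dropna=False)

print('Distinct cities, ignoring missing values : ', without_missing)
print('Distinct cities, counting missing values : ', with_missing)
```

Which setting is appropriate depends on whether a missing entry should be treated as a category of its own in your data.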

