Standardization rescales the data so that each feature (column) has a mean of 0 and a standard deviation of 1. This is achieved by subtracting the column mean from every value and then dividing by the column's standard deviation.
X_normalized = (X - mean(X)) / std(X)
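As a quick check of the formula, here is a minimal sketch that applies it by hand with NumPy to the values 1, 2, 3. The variable names are only for illustration. It assumes the population standard deviation (NumPy's default, ddof=0), which is also what StandardScaler uses.

```python
import numpy as np

# Sample values (illustrative only)
X = np.array([1.0, 2.0, 3.0])

mean = X.mean()   # 2.0
std = X.std()     # population standard deviation (ddof=0), about 0.8165

# Apply the standardization formula element-wise
X_normalized = (X - mean) / std

print(X_normalized)  # approximately [-1.2247  0.  1.2247]
```

After the transformation, the values have a mean of 0 and a standard deviation of 1.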
How to standardize a Pandas DataFrame?
Step 1: Create a StandardScaler instance
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
Step 2: Use the 'fit_transform' method of the StandardScaler object to transform the data.
# Select numeric columns
numeric_columns = df.select_dtypes(include=['float64', 'int64']).columns

# Transform the selected columns
standardized_values = scaler.fit_transform(df[numeric_columns])
The 'fit_transform' method returns a NumPy ndarray, not a DataFrame.
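Because the result is a plain ndarray, the column labels and the index are lost. If you want to keep working with a DataFrame, one possible approach is to wrap the array again. The sketch below is an assumption about how you might do it, not the only way; the names standardized_df and result are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Small sample DataFrame (illustrative only)
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'D': [True, False, False]})

numeric_columns = df.select_dtypes(include=['float64', 'int64']).columns

scaler = StandardScaler()
standardized_values = scaler.fit_transform(df[numeric_columns])

# Wrap the ndarray back into a DataFrame, reusing the original labels
standardized_df = pd.DataFrame(
    standardized_values, columns=numeric_columns, index=df.index
)

# Re-attach the columns that were not standardized (here the boolean column 'D')
result = pd.concat([standardized_df, df.drop(columns=numeric_columns)], axis=1)

print(result)
```

Reusing df.index keeps the standardized rows aligned with the untouched columns when they are concatenated.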
The complete working application is shown below.
standardization.py
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Create a sample DataFrame
data = {
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9],
    'D': [True, False, False]
}
df = pd.DataFrame(data)

# Create a StandardScaler instance
scaler = StandardScaler()

# Select numeric columns (the boolean column 'D' is excluded)
numeric_columns = df.select_dtypes(include=['float64', 'int64']).columns

# Transform the selected columns
standardized_values = scaler.fit_transform(df[numeric_columns])

print(f'type of standardized_values : {type(standardized_values)}')
print(f'standardized_values : \n{standardized_values}')
Output
type of standardized_values : <class 'numpy.ndarray'>
standardized_values :
[[-1.22474487 -1.22474487 -1.22474487]
 [ 0.          0.          0.        ]
 [ 1.22474487  1.22474487  1.22474487]]
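The value 1.22474487 may look surprising if you expected -1, 0 and 1. StandardScaler divides by the population standard deviation (ddof=0), while pandas' std() defaults to the sample standard deviation (ddof=1). The following sketch compares the two; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

scaled = StandardScaler().fit_transform(df)

# Manual standardization with the population standard deviation (ddof=0)
manual_population = (df - df.mean()) / df.std(ddof=0)

# Manual standardization with pandas' default sample standard deviation (ddof=1)
manual_sample = (df - df.mean()) / df.std()

matches_population = np.allclose(scaled, manual_population.to_numpy())
matches_sample = np.allclose(scaled, manual_sample.to_numpy())

print(f'matches ddof=0 : {matches_population}')
print(f'matches ddof=1 : {matches_sample}')
```

With this data, only the ddof=0 version matches the StandardScaler output; the ddof=1 version yields -1, 0 and 1 in every column.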