At a high level, the machine learning process has five phases.
- Data pre-processing
- Data modelling
- Model evaluation
- Model deployment
- Model monitoring and maintenance
1. Data pre-processing
Data pre-processing is the core step in the machine learning process. It transforms the raw data into a format that can be consumed in the model training (data modelling) phase.
Following are the common steps in data pre-processing.
a. Data Collection
This step collects all the data required to perform a machine learning task. The data might come from various sources such as files, databases, APIs, the internet, etc.
b. Data Cleaning
This step cleans the data by performing tasks such as removing null values, removing duplicate values, and populating missing values.
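As a rough illustration of the cleaning tasks above, here is a minimal sketch in plain Python. The record layout (`name`, `age`), the helper name `clean_records`, and the choice to fill missing ages with the mean are all assumptions made for this example; real projects usually pick a fill strategy per column.

```python
def clean_records(records):
    """Drop exact duplicate records, then fill missing ages with the mean age."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))  # hashable fingerprint of the record
        if key not in seen:
            seen.add(key)
            unique.append(dict(rec))  # copy so the input is left untouched

    # Mean of the ages that are actually present (assumed fill strategy).
    known = [r["age"] for r in unique if r["age"] is not None]
    mean_age = sum(known) / len(known) if known else None

    for r in unique:
        if r["age"] is None:
            r["age"] = mean_age
    return unique


# Hypothetical raw data: one duplicate row and one missing value.
raw = [
    {"name": "Ann", "age": 30},
    {"name": "Ann", "age": 30},
    {"name": "Bob", "age": None},
    {"name": "Cid", "age": 40},
]
cleaned = clean_records(raw)
```

After the call, `cleaned` holds three records and Bob's missing age is replaced by the mean of the known ages.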
c. Data Integration
In most cases, the data is spread across multiple sources such as files, databases, and APIs. The data integration step combines the data from these sources into a single dataset.
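One way to picture integration is joining two sources on a shared key. The sketch below assumes two hypothetical sources, a customer list and an order list, linked by a `customer_id` field; the field names and the `integrate` helper are made up for illustration.

```python
def integrate(customers, orders):
    """Combine customer records with the total order amount per customer."""
    totals = {}
    for order in orders:
        cid = order["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + order["amount"]

    # One row per customer; customers without orders get a total of 0.0.
    return [
        {**customer, "total_spent": totals.get(customer["customer_id"], 0.0)}
        for customer in customers
    ]


# Hypothetical data from two different sources.
customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 5.0},
]
dataset = integrate(customers, orders)
```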
d. Data transformation
This step transforms the data into a format that machine learning algorithms understand. For example, if your model deals with text data, this step might convert the text to lower case, remove stop words, and remove special characters such as punctuation symbols.
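The text example above could look roughly like this. The stop-word list here is a tiny hand-picked set used only for illustration, and `preprocess_text` is a hypothetical helper name; real projects typically use a larger list from an NLP library.

```python
import string

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to"}


def preprocess_text(text):
    """Lower-case the text, strip punctuation, and drop stop words."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return [token for token in no_punct.split() if token not in STOP_WORDS]


tokens = preprocess_text("The model is learning, and FAST!")
# tokens == ["model", "learning", "fast"]
```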
e. Data splitting
This step splits the pre-processed data into training and testing sets. The training set is used to train the machine learning model, and the testing set is used to evaluate it. A common choice is to use 80% of the data as the training set and 20% as the test set.
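A minimal sketch of an 80/20 split follows. The function name `split_train_test` and the fixed `seed` are assumptions for this example; shuffling before the cut is a common practice so that the ordering of the source data does not leak into the split.

```python
import random


def split_train_test(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows and cut it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatable splits
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


train_set, test_set = split_train_test(range(10))
# With 10 rows and test_ratio=0.2: 8 training rows and 2 test rows.
```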
f. Feature Engineering
This step creates new features from the existing ones in the raw data. For example, if the raw dataset contains a person's height and weight, you can construct the body mass index (BMI) from them.
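The BMI example can be sketched as below. It assumes height is stored in metres and weight in kilograms; the field names `height_m` and `weight_kg` are hypothetical.

```python
def add_bmi(person):
    """Return a copy of the record with a derived 'bmi' feature.

    BMI = weight (kg) / height (m) squared.
    """
    bmi = person["weight_kg"] / (person["height_m"] ** 2)
    return {**person, "bmi": bmi}


# Hypothetical raw record with assumed units.
person = add_bmi({"height_m": 1.75, "weight_kg": 70.0})
```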
g. Data Normalization
This step rescales the numeric features of a dataset to a standard range or distribution. For example, min-max scaling rescales each feature to a fixed range, typically 0 to 1.
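Min-max scaling maps each value v to (v - min) / (max - min). A small sketch, where `min_max_scale` is a hypothetical helper and mapping a constant feature to 0.0 is an assumption made to avoid dividing by zero:

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: no spread to scale, so return zeros (assumed choice).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_scale([10, 20, 30])
# scaled == [0.0, 0.5, 1.0]
```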
2. Data modelling
The data modelling phase takes the pre-processed data and creates a machine learning model from it.
Following are the steps involved in the data modelling phase.
a. Model selection
This step selects the model or algorithm that best fits your use case. Selecting the right algorithm plays a vital role in the machine learning process.
b. Model training
Once the algorithm is selected, train the model on the training dataset.
c. Make predictions
Start making predictions using the trained model.
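To make the train-then-predict flow concrete, here is a minimal sketch that uses simple linear regression (ordinary least squares with one feature) as the selected algorithm. This is only one possible choice, and the `fit` and `predict` helpers and the toy data are assumptions for illustration.

```python
def fit(xs, ys):
    """Train: estimate slope and intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict: apply the learned line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Toy training data that follows y = 2x.
model = fit([1, 2, 3, 4], [2, 4, 6, 8])
prediction = predict(model, 5)
```

On this toy data the learned line has a slope of about 2 and an intercept of about 0, so the prediction for an input of 5 is about 10.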
3. Model evaluation
This step evaluates the model's performance. Following are the key steps in this phase.
a. Assess the performance of the model
Evaluate the model on the test dataset and compare its predictions against the actual values.
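Comparing predictions against actual values usually comes down to a metric. The sketch below uses mean absolute error, which suits numeric predictions; the metric choice and the sample numbers are assumptions, and classification tasks would more likely use something like accuracy.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical test-set values and model predictions.
mae = mean_absolute_error([3.0, 5.0, 7.0], [2.5, 5.0, 8.0])
# mae == 0.5
```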
b. Model tuning
Tune the model's hyperparameters to optimize its performance.
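Tuning can be as simple as trying several candidate values and keeping the best one. The sketch below tunes a single decision threshold for a score-based classifier; the threshold setting, the candidate values, and the sample scores are all hypothetical, and in practice the search is usually run on held-out validation data.

```python
def accuracy(scores, labels, threshold):
    """Fraction of examples classified correctly at the given threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def tune_threshold(scores, labels, candidates):
    """Return the candidate threshold with the highest accuracy."""
    return max(candidates, key=lambda t: accuracy(scores, labels, t))


# Hypothetical model scores with their true labels.
scores = [0.1, 0.4, 0.6, 0.9]
labels = [0, 0, 1, 1]
best_threshold = tune_threshold(scores, labels, [0.3, 0.5, 0.7])
# best_threshold == 0.5
```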
4. Model deployment
Deploy the model so that it can make predictions on completely new, unseen data.
5. Model monitoring and maintenance
Continuously monitor the model, and retrain it periodically as new data becomes available.
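One simple way to monitor a deployed model is to track its accuracy over the most recent predictions and flag it for retraining when the accuracy drops. The sketch below is only an illustration: the `AccuracyMonitor` class, the window size, and the accuracy threshold are assumptions, and it presumes the actual outcomes eventually become known.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over a sliding window of recent predictions."""

    def __init__(self, window_size=100, min_accuracy=0.8):
        self.outcomes = deque(maxlen=window_size)  # True where prediction was right
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        if not self.outcomes:
            return None  # nothing observed yet
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        acc = self.accuracy()
        return acc is not None and acc < self.min_accuracy


# Hypothetical stream of (predicted, actual) pairs from production.
monitor = AccuracyMonitor(window_size=4, min_accuracy=0.75)
for predicted, actual in [(1, 1), (0, 1), (1, 0), (0, 0)]:
    monitor.record(predicted, actual)
```

Here two of the four recent predictions were correct, so the windowed accuracy falls below the assumed threshold and `monitor.needs_retraining()` returns `True`.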