Machine Learning 101 P1: Feature Scaling with Python

Hang Nguyen
2 min read · Dec 14, 2024

There are 3 main stages in any machine learning process:

  • Data pre-processing
  • Data modeling
  • ML model evaluation

Feature scaling is part of the data pre-processing stage.

What is feature scaling in the data pre-processing stage?

Feature scaling is a method used to normalize or standardize the range of independent variables (features) so that no single feature dominates the others simply because of the scale of the raw data. Feature scaling is always applied column by column (feature by feature).
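
To make the "dominance" problem concrete, here is a small sketch with made-up numbers (the ages and salaries are purely illustrative): on raw values, a distance-based model effectively sees only the salary column.

```python
import numpy as np

# Two samples described by toy values for (age, annual salary in USD).
a = np.array([25, 50_000])
b = np.array([55, 52_000])

# The Euclidean distance is driven almost entirely by salary, because its
# raw range is thousands of times larger than the range of age.
print(np.linalg.norm(a - b))  # ~2000.2, even though the ages differ a lot
```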

There are several feature scaling techniques; some of the most widely known are listed below (note that they are not interchangeable), and a small code sketch follows the list:

  • Normalization: recommended when most of the features follow a normal distribution.
  • Standardization (z-score normalization): works well in practically all cases :)
  • Min-max scaling
  • Absolute maximum scaling

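As a quick, hedged illustration, here is a minimal NumPy sketch of how these techniques can be applied column-wise. The toy matrix is made up, and "normalization" is interpreted here as mean normalization, (x - mean) / (max - min), which is an assumption since the formula is not spelled out above.

```python
import numpy as np

# Toy feature matrix: rows are samples, columns are the features (age, salary).
X = np.array([[25.0, 50_000.0],
              [40.0, 62_000.0],
              [55.0, 52_000.0]])

# All statistics are computed per column (axis=0), because feature scaling
# is applied column by column.
mean, std = X.mean(axis=0), X.std(axis=0)
x_min, x_max = X.min(axis=0), X.max(axis=0)

# Standardization (z-score): each column gets mean 0 and standard deviation 1.
X_standardized = (X - mean) / std

# Min-max scaling: squeezes each column into the [0, 1] range.
X_minmax = (X - x_min) / (x_max - x_min)

# Mean normalization (one common reading of "normalization"):
# center on the mean, divide by the column's range.
X_mean_norm = (X - mean) / (x_max - x_min)

# Absolute maximum scaling: divide by the largest absolute value,
# mapping each column into [-1, 1].
X_maxabs = X / np.abs(X).max(axis=0)
```

In practice, scikit-learn ships ready-made transformers for three of these: StandardScaler, MinMaxScaler, and MaxAbsScaler.
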
Feature scaling before or after splitting the dataset into training and test sets

The idea of splitting the dataset into a training set and a test set is to use the training set to train the ML model and the test set to evaluate it. Feature scaling manipulates the data so that all feature values fall within the same range. If we perform feature scaling before splitting the dataset…
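
For reference, here is a minimal sketch of the common split-then-scale convention, where the scaler is fitted on the training set only and then reused on the test set; the toy data, split ratio, and choice of StandardScaler are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy data standing in for a real dataset: 6 samples, 2 features (age, salary).
X = np.array([[25, 50_000], [40, 62_000], [55, 52_000],
              [30, 48_000], [45, 70_000], [50, 58_000]], dtype=float)
y = np.array([0, 1, 0, 0, 1, 1])

# Split first...
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

# ...then fit the scaler on the training set only and apply the same, already
# fitted, transformation to the test set, so that no statistics from the test
# set leak into the training data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```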
