Mastering Numeric Feature Scaling: 4 Techniques with Scikit‑Learn
This article explains why numeric feature engineering is essential for machine learning, outlines the challenges of differing scales and outliers, and demonstrates four preprocessing methods—Standardization, Robust Scaler, Power Transformer, and Normalization—using the California housing dataset with detailed code examples and visual analysis.
Numeric feature engineering is a crucial preprocessing step in machine learning, addressing two core problems: disparate feature magnitudes and outliers. Using the California housing dataset, we illustrate four common scaling techniques and discuss when each should be applied.
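The snippets below assume X is the California housing feature matrix; as a minimal setup sketch, it can be loaded with scikit-learn's fetch_california_housing loader:

from sklearn.datasets import fetch_california_housing

# Load the feature matrix as a plain NumPy array
housing = fetch_california_housing()
X = housing.data

# Feature order: MedInc, HouseAge, AveRooms, AveBedrms, Population, AveOccup, Latitude, Longitude
print(housing.feature_names)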
Standardization (StandardScaler)
Standardization transforms features to zero mean and unit variance, making them comparable for scale-sensitive algorithms such as linear regression, SVM, and PCA. Because it relies on the mean and standard deviation, it is highly sensitive to outliers.
from sklearn.preprocessing import StandardScaler

# Subtract the per-feature mean and divide by the per-feature standard deviation
standard_scaler = StandardScaler()
standardized_x = standard_scaler.fit_transform(X)

After scaling, MedInc lies in roughly [-2, 4] and Population in [-1, 4], but extreme values still dominate the result, since they pull on both the mean and the standard deviation.
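A quick sanity check (a sketch reusing standardized_x from above) confirms the zero-mean, unit-variance property:

import numpy as np

# Every column should now have mean ~0 and standard deviation ~1
print(np.round(standardized_x.mean(axis=0), 2))
print(np.round(standardized_x.std(axis=0), 2))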
Robust Scaler
RobustScaler replaces mean and standard deviation with median and inter‑quartile range (IQR), reducing the influence of extreme outliers while keeping them in the data.
from sklearn.preprocessing import RobustScaler

# Center on the median and scale by the inter-quartile range (IQR)
robust_scaler = RobustScaler(quantile_range=(25.0, 75.0), with_centering=True, with_scaling=True, unit_variance=True)
robust_x = robust_scaler.fit_transform(X)

The bulk of both features now falls into a tighter interval (e.g., MedInc ≈ [-2, 5], Population ≈ [-2, 6]).
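As a quick check on what changed (a sketch reusing robust_x from above), the per-column median of the transformed data should sit at roughly zero, since RobustScaler centers on the median rather than the mean:

import numpy as np

# Median-centering puts each column's median at ~0; the IQR sets the scale
print(np.round(np.median(robust_x, axis=0), 2))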
Power Transformer
PowerTransformer (e.g., Yeo‑Johnson) compresses long tails, turning a right‑skewed distribution into a near‑normal shape while preserving the information of extreme values.
from sklearn.preprocessing import PowerTransformer
pt = PowerTransformer(method='yeo-johnson')
pt_transformed = pt.fit_transform(X[:, [1]])

Histograms before and after the transformation show a clear shift from a skewed to a bell-shaped distribution.
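Beyond eyeballing histograms, the change in shape can be quantified with a skewness statistic; a sketch assuming SciPy is installed and reusing pt_transformed from above:

from scipy.stats import skew

# Skewness near 0 indicates a roughly symmetric, bell-shaped distribution
print(skew(X[:, 1]))               # before the transformation
print(skew(pt_transformed[:, 0]))  # after: noticeably closer to 0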
Normalization (Min‑Max Scaler)
Normalization rescales features to the [0, 1] interval, which is essential for distance‑based algorithms (e.g., KNN) and helps neural networks avoid saturated activations.
from sklearn.preprocessing import MinMaxScaler

# Map each feature's observed minimum to 0 and maximum to 1
min_max_scaler = MinMaxScaler()
normalized_x = min_max_scaler.fit_transform(X)

While Population's maximum maps to 1.0, the majority of values are compressed into a narrow band (roughly 0 to 0.16), illustrating the method's sensitivity to extreme outliers.
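To see the compression effect directly, a sketch reusing normalized_x from above (Population is column index 4 in the California housing feature order):

import numpy as np

# Every column now spans exactly [0, 1]
print(normalized_x.min(axis=0), normalized_x.max(axis=0))

# Fraction of Population values squeezed into the narrow [0, 0.16) band
population = normalized_x[:, 4]
print(np.mean(population < 0.16))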
When to Use Each Scaler
Issue                | Best Tool                 | Why?
---------------------+---------------------------+-----------------------------------------------------
Different Scales     | StandardScaler            | Makes features directly comparable.
Heavy Skew           | Power/QuantileTransformer | Normalizes the distribution shape.
Extreme Outliers     | RobustScaler              | Median and IQR are barely affected by outliers.
Neural Network Input | MinMaxScaler              | Matches the expected [0, 1] input range of neurons.

Remember to call .fit() only on the training data to avoid data leakage; then use .transform() on the training, validation, and test sets.
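A minimal leakage-free pattern looks like this (a sketch assuming a simple train/test split of X):

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_train, X_test = train_test_split(X, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn statistics from training data only
X_test_scaled = scaler.transform(X_test)        # reuse those statistics; never refit on test data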
