Why Dimensionality Reduction Matters: Unveiling PCA’s Power in Machine Learning
This article explains how high‑dimensional data burdens machine‑learning models, introduces dimensionality reduction as a solution, and walks through the principles, objective function, and eigenvalue‑based solution of Principal Component Analysis (PCA), while hinting at its extensions.
Introduction
Physical theories describe a universe of space and time, some proposing as many as ten dimensions. Machine learning faces an analogous challenge: data often lives in very high‑dimensional spaces, which makes processing costly and exposes models to the "curse of dimensionality".
Why Dimensionality Reduction?
Images in computer vision are a typical example: a 100×100 RGB picture, flattened, yields a 30,000‑dimensional feature vector (100 × 100 × 3), and text corpora produce tens of thousands of dimensions in document‑word matrices. Reducing these to a compact representation preserves the essential information while saving computation and storage.
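The flattening step above can be sketched in a few lines of NumPy (the random image stands in for real pixel data):

```python
import numpy as np

# A 100x100 RGB image: height x width x channels.
image = np.random.rand(100, 100, 3)

# Flattening it yields the 30,000-dimensional feature vector
# (100 * 100 * 3) mentioned above.
vector = image.reshape(-1)
print(vector.shape)  # (30000,)
```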
Common Dimensionality‑Reduction Methods
Techniques include PCA, LDA, Isomap, LLE, Laplacian Eigenmaps (LE), and Locality Preserving Projections (LPP). PCA is the most classic, over a century old, and is a linear, unsupervised, global method.
Scenario Description
High‑dimensional feature vectors contain redundancy and noise. By applying dimensionality reduction we can uncover intrinsic data characteristics, improve feature expressiveness, and lower training complexity. PCA is a frequent interview topic.
Problem
Explain PCA’s principle, its objective function, and how to solve it.
Background Knowledge: Linear Algebra
Understanding eigenvalues and eigenvectors, and in particular the eigendecomposition of the covariance matrix, is essential.
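As a quick refresher (the notation here is a standard convention, not taken from the original article): for centered data points $x_1, \dots, x_n \in \mathbb{R}^d$, the covariance matrix and its eigenpairs are

```latex
\Sigma = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^{\top},
\qquad
\Sigma w = \lambda w, \quad \|w\| = 1.
```

The variance of the data projected onto a unit direction $w$ is $w^{\top}\Sigma w$, which for an eigenvector equals its eigenvalue $\lambda$; this identity is what connects eigendecomposition to PCA below.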
Solution and Analysis
PCA seeks the directions (principal components) that capture the most variance in the data. In a simple 3‑D example where points lie on a plane through the origin, rotating the coordinate system aligns that plane with the x‑y axes, allowing the data to be represented by two dimensions without loss.
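The 3‑D plane example can be checked numerically. The sketch below (synthetic data; the construction via a random orthonormal basis is my own illustration, not from the original) builds points that lie exactly on a plane through the origin and shows that the covariance matrix has only two nonzero eigenvalues, so two components represent the data without loss:

```python
import numpy as np

# Points on a plane through the origin: intrinsically 2-D data
# embedded in 3-D via two orthonormal directions.
rng = np.random.default_rng(0)
coords_2d = rng.normal(size=(200, 2))             # intrinsic 2-D coordinates
basis = np.linalg.qr(rng.normal(size=(3, 2)))[0]  # orthonormal 3x2 basis
X = coords_2d @ basis.T                           # 200 points in 3-D, all on one plane

# Eigenvalues of the covariance matrix, sorted in descending order.
cov = np.cov(X, rowvar=False)
eigvals = np.linalg.eigh(cov)[0][::-1]
print(eigvals)  # the third eigenvalue is ~0: two components suffice
```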
In higher dimensions we cannot visualize directly, so consider a 2‑D illustration: after centering the data points, the principal axis is the direction along which the projected data spreads out the most. Maximizing this projected variance is the core objective of PCA.
Mathematically, the variance of the data projected onto a unit direction w is wᵀΣw, where Σ is the covariance matrix. This quantity is maximized when w is the eigenvector of Σ with the largest eigenvalue, and the maximum variance equals that eigenvalue; this direction is the first principal component. The eigenvector with the second‑largest eigenvalue, orthogonal to the first, gives the second component, and so on. This yields the PCA solution procedure: center the data, form the covariance matrix, eigendecompose it, and project onto the top eigenvectors.
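The full procedure can be condensed into a short function. This is a minimal sketch of the eigendecomposition route described above (function name and interface are my own; a production implementation would typically use SVD or `sklearn.decomposition.PCA`):

```python
import numpy as np

def pca(X, k):
    """Project n x d data X onto its top-k principal components."""
    X_centered = X - X.mean(axis=0)             # 1. center the data
    cov = X_centered.T @ X_centered / len(X)    # 2. covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # 3. eigendecomposition
    order = np.argsort(eigvals)[::-1][:k]       # 4. indices of top-k eigenvalues
    W = eigvecs[:, order]                       # 5. d x k projection matrix
    return X_centered @ W                       # 6. n x k reduced representation
```

By construction the first returned column has the largest projected variance, the second the next largest, and so on.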
Summary and Extensions
We explained PCA’s principle, objective function, and solution via eigen‑decomposition. While PCA is linear and classic, it has limitations; kernel PCA (KPCA) and manifold learning methods (Isomap, LLE, LE) extend it to nonlinear scenarios, which will be covered in future articles.
Next Preview
Upcoming: Unsupervised Learning Algorithms and Evaluation, focusing on clustering and correlation techniques.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.