
Are Projected Points Still Linearly Separable? SVM Insight & Proof

This article examines whether points from two linearly separable classes remain separable after being projected onto the SVM decision hyperplane, providing geometric and convex‑optimization proofs along with illustrative diagrams and references for deeper study.

Hulu Beijing

Introduction

Support Vector Machine (SVM) is a powerful supervised learning method that appears in virtually every classic machine‑learning textbook and is a frequent topic in technical interviews. The first interview question about SVM typically tests understanding of the model’s derivation.

Problem

Given two linearly separable classes of points in space, after projecting each point onto the SVM decision hyperplane, are the projected points still linearly separable? Provide a proof of your view.

Answer and Analysis

Linear separability means a hyperplane can completely separate the two classes. In the diagram below, the blue line represents the SVM decision hyperplane separating the red and green point clouds. Projecting the points orthogonally onto this hyperplane (as shown on the right) places every point on the hyperplane itself.
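Concretely, a point x is dropped onto the hyperplane {z : wᵀz + b = 0} by subtracting its signed distance along the normal w. A minimal numpy sketch (the helper name and example values are ours, not from the article):

```python
import numpy as np

def project_onto_hyperplane(x, w, b):
    """Orthogonal projection of x onto the hyperplane {z : w.z + b = 0}."""
    x, w = np.asarray(x, float), np.asarray(w, float)
    return x - (w @ x + b) / (w @ w) * w

# Example: project (2, 2) onto the line x + y - 1 = 0.
p = project_onto_hyperplane([2.0, 2.0], w=[1.0, 1.0], b=-1.0)
print(p)  # [0.5 0.5], which satisfies x + y - 1 = 0
```

The projected point always satisfies wᵀp + b = 0 exactly, which is the sense in which all projections "land on the hyperplane."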

At first glance the projected points appear not to be linearly separable. A simple counter‑example uses only two points, one from each class. The SVM hyperplane is the perpendicular bisector of the segment joining the two points; after projection both points collapse onto the same location on the hyperplane, making separation impossible.
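The two-point counterexample can be verified directly. With one point per class, the max-margin hyperplane is the perpendicular bisector of the segment joining them, and both points project onto the same location. A sketch with concrete coordinates of our choosing:

```python
import numpy as np

# Two points, one per class: a at the origin, c at (2, 2).
# The max-margin hyperplane is the perpendicular bisector of segment ac.
a, c = np.array([0.0, 0.0]), np.array([2.0, 2.0])
w = c - a                   # normal vector of the bisector
b = -w @ ((a + c) / 2)      # the bisector passes through the midpoint

# Orthogonal projection onto {x : w.x + b = 0}
proj = lambda x: x - (w @ x + b) / (w @ w) * w
print(proj(a), proj(c))  # both points land on the midpoint [1. 1.]
```

Since the two projections coincide, no hyperplane can put them on opposite sides, so the projected "dataset" is not separable.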

In fact, for any linearly separable dataset, the projections onto the SVM hyperplane are always linearly inseparable. A proof by contradiction first considers the case where every training point is a support vector. If the projected points were still separable, there would exist a hyperplane (the dashed blue line in the diagram) that separates them; tilting the SVM hyperplane toward it would produce a separating hyperplane with a strictly larger margin, contradicting the maximality of the SVM margin.

The argument extends from support vectors to general datasets via the KKT conditions of the SVM dual problem: any data point with a non-zero Lagrange multiplier must be a support vector, and every non-support vector has multiplier zero. The optimal hyperplane is therefore determined by the support vectors alone, so discarding non-support vectors changes neither the hyperplane nor the contradiction. This dependence on support vectors only is also what makes SVM prediction computationally efficient.
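The KKT statement can be observed on a fitted linear SVM. A sketch assuming scikit-learn is available (the synthetic data and the large-C hard-margin approximation are our choices, not from the article):

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated Gaussian blobs, 20 points per class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # large C approximates hard margin

# dual_coef_ stores y_i * alpha_i for the support vectors only;
# every stored multiplier is non-zero, and all other points have alpha_i = 0.
print(clf.support_.size, "support vectors out of", len(X))
print(bool(np.all(clf.dual_coef_ != 0)))
```

Only a handful of the 40 points end up as support vectors; the decision function is a linear combination of those points alone.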

Alternatively, the separating hyperplane theorem from convex optimization gives a concise argument. The SVM hyperplane is the perpendicular bisector of the shortest segment connecting the convex hulls of the two classes. Both endpoints of that segment therefore project onto the same point of the hyperplane, so the projected convex hulls intersect, and two intersecting convex sets cannot be separated by any hyperplane.
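The collapse of the shortest segment can also be checked numerically: project both classes onto the learned hyperplane and verify that their 1-D coordinates along the hyperplane overlap. A sketch assuming scikit-learn and synthetic Gaussian data of our choosing:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 0.7, (30, 2)), rng.normal(2, 0.7, (30, 2))])
y = np.array([-1] * 30 + [1] * 30)

clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

P = X - np.outer((X @ w + b) / (w @ w), w)  # projections onto the hyperplane
t = np.array([-w[1], w[0]])                 # tangent direction of the line
s = P @ t                                   # 1-D coordinate along the line

neg, pos = s[y == -1], s[y == 1]
overlap = max(neg.min(), pos.min()) <= min(neg.max(), pos.max())
print("projected classes overlap:", overlap)
```

The overlapping intervals mean no threshold along the hyperplane separates the projected classes, matching the theoretical claim.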

For interviewees, presenting the geometric contradiction is sufficient; mentioning the convex‑optimization perspective demonstrates deeper understanding.

Further Reading

SVM derivation notes: http://cs229.stanford.edu/notes/cs229-notes3.pdf

Dual problem and KKT conditions: http://stanford.edu/class/ee364a/lectures/duality.pdf

Separating hyperplane theorem: http://www.princeton.edu/~amirali/Public/Teaching/ORF523/S16/ORF523_S16_Lec5_gh.pdf

Tags: Machine Learning, SVM, convex optimization, hyperplane projection, linear separability