Understanding Linear and Logistic Regression: From MSE to Cross‑Entropy
This article explains the fundamentals of linear and logistic regression, covering loss functions such as mean-squared error and cross-entropy, the analytic solution for linear regression, feature expansion for non-linearly separable data, and Python code examples that illustrate the concepts.
Optimization provides a way to minimize loss functions; in deep learning the goal is to reduce training error and, ultimately, generalization error. This article introduces linear models as the foundation of deep neural networks.
Linear regression models predict a continuous target by computing the dot product w·x of a weight vector w and a feature vector x. Examples include predicting salary from a job description or supply from inventory variables. Model error is measured with mean-squared error (MSE): square the difference between each prediction w·x_i and its true value y_i, then average over all samples. Stacking the feature vectors as rows of a matrix X gives the vector form MSE = (1/n)‖Xw − y‖², where n is the number of samples.
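For concreteness, a minimal numpy sketch of that vectorized MSE might look as follows; the variable names (w, X, y) are illustrative and not taken from the article's own code:

```python
import numpy as np

# Minimal sketch of the vectorized MSE described above (names are assumptions).
def mse(w, X, y):
    """Mean-squared error: average of (w·x_i - y_i)^2, i.e. (1/n) * ||Xw - y||^2."""
    residuals = X @ w - y            # per-sample errors w·x_i - y_i
    return np.mean(residuals ** 2)   # average of the squared errors
```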
An analytic solution for the optimal weights is obtained by setting the derivative of the MSE to zero and solving the resulting linear system, w = (XᵀX)⁻¹Xᵀy, which requires matrix inversion. This becomes computationally expensive when the number of features exceeds about one hundred.
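A sketch of that analytic solution, assuming a design matrix X of shape (n_samples, n_features) and a target vector y, might be:

```python
import numpy as np

# Setting the MSE gradient to zero gives the normal equation X^T X w = X^T y.
# Solving the linear system is cheaper and more stable than forming the inverse
# explicitly; this is an illustrative sketch, not the article's own code.
def fit_linear_regression(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)
```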
For classification tasks, the article introduces logistic regression, which maps the linear combination w·X through the logistic function to obtain a probability of belonging to class 1. Binary and multi‑class scenarios are described.
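A small sketch of the logistic mapping and the resulting binary decision rule; the function and variable names here are illustrative assumptions, not the article's own code:

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping any real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_class(x, w, threshold=0.5):
    """Binary decision rule: class 1 if P(y=1 | x) = sigmoid(w·x) exceeds the threshold."""
    return int(sigmoid(np.dot(w, x)) > threshold)
```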
A Python example generates a two-dimensional, non-linearly separable dataset using sklearn.datasets.make_moons and visualizes it with matplotlib. Because the data cannot be separated by a straight line, the article expands the features with polynomial terms via an expand function that maps each sample to the vector [x0, x1, x0², x1², x0·x1, 1], making the classes separable by a linear decision boundary in the expanded feature space.
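One way this workflow might look in code is sketched below; the sample count, noise level, and random seed are assumptions, since the article does not state them:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons

# Generate and plot a two-moons dataset (parameters are assumptions).
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.show()

def expand(X):
    """Map each 2-D sample [x0, x1] to [x0, x1, x0^2, x1^2, x0*x1, 1]."""
    x0, x1 = X[:, 0], X[:, 1]
    return np.column_stack([x0, x1, x0 ** 2, x1 ** 2, x0 * x1, np.ones(len(X))])

X_expanded = expand(X)   # shape (n_samples, 6)
```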
The logistic‑regression probability is defined as 1 / (1 + np.exp(-np.dot(X, w))). Using dummy weights uniformly spaced between –1 and 1, the computed probability for the first expanded sample is approximately 0.8679.
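Continuing the sketch above (X_expanded comes from the expand step), the probability computation with dummy weights might look as follows; the exact value depends on the generated data, so the article's 0.8679 will not necessarily be reproduced:

```python
import numpy as np

# Dummy weights uniformly spaced between -1 and 1, one per expanded feature
# (the exact spacing convention is an assumption).
dummy_weights = np.linspace(-1, 1, 6)

# Predicted probability of class 1 for every expanded sample.
probabilities = 1.0 / (1.0 + np.exp(-np.dot(X_expanded, dummy_weights)))
print(probabilities[0])   # the article reports ~0.8679 for its particular data
```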
Model training minimizes the cross-entropy loss. The article presents the per-sample loss −[y·log(p) + (1−y)·log(1−p)], its average over all samples, and a compute_loss function that implements this average cross-entropy over a dataset. With the dummy weights, the computed cross-entropy value is about 1.0524.
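A hedged sketch of such an average cross-entropy function is shown below; the article's compute_loss may differ in details such as argument order or numerical clipping:

```python
import numpy as np

def compute_loss(X, y, w):
    """Average binary cross-entropy: -mean(y*log(p) + (1-y)*log(1-p))."""
    p = 1.0 / (1.0 + np.exp(-np.dot(X, w)))   # predicted P(y=1 | x)
    p = np.clip(p, 1e-12, 1 - 1e-12)          # guard against log(0) (added safeguard)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
```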
In conclusion, linear regression is trained by minimizing MSE, and its analytic solution is often impractical for high-dimensional data, while logistic regression finds its optimal parameters by minimizing cross-entropy.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Data STUDIO
Data STUDIO focuses on original data science articles, centered on Python, covering machine learning, data analysis, visualization, MySQL, and other practical knowledge and project case studies.
